DESCRIPTION

NOVEL TRYPSIN FAMILY SERINE PROTEASES

Technical Field

The present invention relates to novel trypsin-family serine
proteases, the genes encoding them, and the production and uses
thereof.

Background Art

In the testis, the male reproductive organ, sperm, i.e. male
gametes, are primarily formed through the following three-step
process: (1) the self-renewal of spermatogonia as germ-line stem
cells and the initiation of their differentiation into sperm,
(2) meiotic division of spermatocytes and the associated gene
recombination, and (3) morphogenesis of haploid spermatids into
sperm. The sperm formed in this manner are expelled into the female
body during coitus, pass along the oviduct, and bind to an egg, the
female gamete, to achieve fertilization (Yomogida, K. and
Nishimune, Y. (1998) Protein, Nucleic acid and Enzyme, 511-521). To
achieve fertilization, a sperm must move through the oviduct,
adhere to and penetrate the zona pellucida on the egg surface, and
then fuse with the egg.

A variety of proteases participate in these steps of the
fertilization process. For example, analyses using knockout mice
(Krege, J.H. et al. (1995) Nature 375: 146-148; Esther Jr, C.R. et
al. (1996) Lab. Invest. 74: 953-965) have revealed that sperm
angiotensin-converting enzyme (testis ACE) plays an important role
in sperm transport within the oviduct (Hagaman, J.R. et al. (1998)
Proc. Natl. Acad. Sci. USA 95: 2552-2557). Fertilizing ability is
also markedly reduced in male knockout mice that lack proprotein
convertase 4 (PC4) (Mbikay, M. et al. (1997) Proc. Natl. Acad. Sci.
USA, 94: 6842-6846).

Regarding serine proteases, a variety of trypsin inhibitors
inhibit in vitro fertilization, suggesting that trypsin-like serine
proteases present in the sperm (the acrosome in particular) may
digest the zona pellucida when the sperm penetrates it (Saling,
P.M. (1981) Proc. Natl. Acad. Sci. USA, 78: 6231-6235; Benau, D.A.
and Storey, B.T. (1987) Biol. Reprod., 36: 282-292; Liu, D.Y. and
Baker, H.W. (1993) Biol. Reprod., 48: 340-348). Previously, acrosin,
a trypsin-family serine protease in the acrosome, was assumed to
play this role (Brown, C.R. (1983) J. Reprod. Fertil., 69: 289-295;
Kremling, H. et al. (1991) Genomics, 11: 828-834; Klemm, U. et al.
(1990) Differentiation, 42: 160-166). However, acrosin knockout
mice have been shown to have almost normal fertilizing ability,
suggesting that serine proteases other than acrosin that are
present in the sperm digest the zona pellucida (Baba, T. et al.
(1994) J. Biol. Chem., 269: 31845-31849; Adham, I.M. et al. (1997)
Mol. Reprod. Dev., 46: 370-376). In ascidians, a trypsin-family
serine protease called spermosin is expressed in the sperm (Sawada,
H. et al. (1984) J. Biol. Chem., 259: 2900-2904). An antibody
specific to this protease has been shown to inhibit fertilization
in ascidians in a concentration-dependent manner (Sawada, H. et al.
(1996) Biochem. Biophys. Res. Commun., 222: 499-504). Recently,
cDNAs of the trypsin-family serine proteases TESP1 and TESP2, which
are expressed specifically in the mouse acrosome, were cloned
(Kohno, N. et al. (1998) Biochem. Biophys. Res. Commun., 245:
658-665). However, the roles these genes play in the fertilization
process remain to be clarified. Moreover, serine proteases that
exist in the sperm and are capable of digesting the zona pellucida
have not yet been reported.

Disclosure of the Invention 

An objective of the present invention is to provide novel
trypsin-family serine proteases associated with spermatogenesis and
sperm functions, the genes encoding these proteases, and production
methods and uses thereof.

The present inventors attempted to amplify a gene designated
76A5sc2 by polymerase chain reaction, and eventually found a gene
fragment having a nucleotide sequence different from that of the
76A5sc2 gene. Using this gene fragment, the present inventors have
cloned cDNAs containing the entire open reading frames (ORFs) of
two novel trypsin-family serine proteases ("Tespec PRO-1" and
"Tespec PRO-2") expressed specifically in adult mouse testis. They
have also analyzed the tissue-specific expression of these genes.

"Tespec PRO-1" (Testis specific expressed serine proteinase-1 ) 
is predicted to encode 321 amino acids. The deduced amino acid
sequence contains the trypsin-family serine protease motifs, i.e.,
the "Trypsin-His" and "Trypsin-Ser" active sites, and exhibits
significantly high homology to other trypsin-family serine
proteases, such as acrosin, prostasin, and trypsin, in the regions
of the two motifs and their neighboring regions. In the other
regions, however, no known genes were found to exhibit significant
homology to this protein at the nucleotide or amino acid level. The
foregoing demonstrates that this protein is a novel trypsin-family
serine protease.

On the other hand, "Tespec PRO-2" is predicted to encode 319
amino acids. The protein has a "Trypsin-His" active site. Its
"Trypsin-Ser" active site, which consists of 12 amino acids,
differs from the canonical motif by two amino acid residues. Such a
difference is found in some other known trypsin-family serine
proteases, and thus "Tespec PRO-2" is predicted to function as a
protease. No known genes were found to exhibit significant homology
to "Tespec PRO-2" at the nucleotide or amino acid level. Thus, this
protein is also a novel trypsin-family serine protease.

Interestingly, for "Tespec PRO-2", a splicing isoform was found
that comprises the first half of "Tespec PRO-2" connected to the
latter half of "Tespec PRO-1". This suggests that the genes for
these two proteases are located very close to each other on the
chromosome. Although a variety of splicing isoforms are found for
"Tespec PRO-2", these isoforms do not retain a long stretch of ORF
and thus do not encode any protease. The homology between
"Tespec PRO-1" and "Tespec PRO-2" is 52.2% at the nucleotide level
and 33.1% at the amino acid level.

The present inventors have also successfully cloned a cDNA for
human "Tespec PRO-2" by RT-PCR and RACE, based on the nucleotide
sequence of mouse "Tespec PRO-2". Human "Tespec PRO-2" has been
revealed to have 74.2% and 69.8% homology with mouse "Tespec PRO-2"
at the nucleotide and amino acid levels, respectively. Further, it
has been clarified that human "Tespec PRO-2" is encoded on
chromosome 8.

The present inventors have further succeeded in cloning a cDNA 
encoding human "Tespec PRO-3" by RT-PCR and RACE, based on the 
nucleotide sequence of mouse "Tespec PRO-1". In addition, they also 
succeeded in cloning a cDNA that encodes mouse "Tespec PRO-3", a mouse 
counterpart to human "Tespec PRO-3". 

Northern blot analysis using the coding region of "Tespec
PRO-1" as a probe revealed that this gene is expressed only in
adult mouse testis; no expression was detected in other tissues or
at the fetal stage. Likewise, RT-PCR analysis showed that
expression of "Tespec PRO-1" is distinctly high in the adult
testis. In addition, expression of "Tespec PRO-1" was found to be
increased in the testis of mice 18 days old or older, whereas it
was not expressed in the testis of mice 12 days old or younger or
in spermatogenesis-defective mutant mice. A similar analysis of
"Tespec PRO-2" revealed that the expression pattern of this gene is
identical to that of "Tespec PRO-1". These findings suggest that
both "Tespec PRO-1" and "Tespec PRO-2" are involved in sperm
differentiation and maturation, and/or sperm function
(fertilization). It should be noted that trypsin-family serine
proteases have been suggested to play important roles in
fertilization.

Thus, the present inventors conclude that the proteins encoded 
by the isolated genes are likely serine proteases that play crucial 
roles in fertilization. Accordingly, they may be useful for 
developing new therapeutic or diagnostic agents for sterility, 
and/or for developing new contraceptives. 

The present invention relates to novel trypsin-family serine
proteases thought to be associated with spermatogenesis or sperm
functions, the genes encoding them, and production methods and uses
thereof. More specifically, the present invention provides:

1. a protein comprising the amino acid sequence selected from
the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6,
SEQ ID NO: 8, and SEQ ID NO: 10;

2. a protein functionally equivalent to the protein comprising
an amino acid sequence selected from the group consisting of SEQ ID
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, and SEQ ID NO: 10,
wherein said protein is selected from the group of (a) and (b),
wherein:

(a) is a protein comprising an amino acid sequence selected from
the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6,
SEQ ID NO: 8, and SEQ ID NO: 10, wherein one or more amino acids are
deleted, added, inserted and/or substituted with different amino
acids; and

(b) is a protein encoded by DNA that hybridizes to the DNA
comprising the nucleotide sequence selected from the group
consisting of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO:
7, and SEQ ID NO: 9;

3. a partial peptide of the protein according to any one of (1)
and (2);

4. a fusion protein comprising the first protein according to
any one of (1) and (2), fused with a second peptide;

5. a DNA molecule encoding the protein according to any one of
(1) to (3);

6. a vector into which the DNA according to (5) is inserted;

7. a transformant having the DNA according to (5) in an
expressible form;

8. a method for producing the protein according to any one of
(1) to (3), said method comprising the steps of: culturing the
transformant according to (7), and recovering the expressed protein
from the transformant or the culture supernatant thereof;

9. a method of screening for a substrate of the protein
according to any of (1) and (2), wherein the method comprises the
following steps of:

(a) contacting a test sample with said protein;

(b) detecting the protease activity of said protein against the
test sample; and

(c) selecting a compound that is digested or cleaved by said
protease activity;

10. a substrate of the protein according to any of (1) and (2),
wherein said substrate can be isolated by the method according to
(9);

11. a method of screening for a compound capable of inhibiting
the activity of the protein according to any of (1) and (2), said
method comprising the following steps of:

(a) contacting the protein with the substrate of (10) in the
presence of a test sample;

(b) detecting the protease activity of the protein against the
substrate; and

(c) selecting a compound that reduces the protease activity
relative to the protease activity detected in the absence of the test
sample;

12. a compound that inhibits the activity of the protein
according to any of (1) and (2), wherein said compound can be isolated
by the method according to (11);

13. an antibody that binds to the protein according to any of
(1) and (2);

14. a method for detecting or assaying the protein according
to any of (1) and (2), said method comprising the steps of: contacting
the antibody according to (13) with a test sample that is anticipated
to contain the protein; and detecting or assaying formation of the
immune complex between the antibody and the protein; and

15. a nucleotide sequence specifically hybridizing to the DNA
comprising the nucleotide sequence selected from the group
consisting of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO:
7, and SEQ ID NO: 9, wherein the nucleotide sequence is at least 15
nucleotides in length.

The present invention provides novel trypsin-family serine
proteases. Of the proteins provided in the present invention, the
amino acid sequence of the mouse protein designated "Tespec PRO-1"
is shown in SEQ ID NO: 2, the amino acid sequences of the mouse and
human proteins designated "Tespec PRO-2" are shown in SEQ ID NO: 4
and SEQ ID NO: 6, respectively, and the amino acid sequences of the
mouse and human proteins designated "Tespec PRO-3" are shown in SEQ
ID NO: 8 and SEQ ID NO: 10, respectively. The nucleotide sequences
of the cDNAs encoding these proteins are shown in SEQ ID NO: 1, SEQ
ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, and SEQ ID NO: 9, respectively.
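
The correspondence between sequence identifiers and proteins stated
above can be kept as a small lookup table when the sequence listing is
handled programmatically. The following is a minimal sketch in Python;
the names SEQ_IDS and cdna_seq_id_for are hypothetical and are
introduced only for illustration.

```python
# Pairing of amino acid and cDNA sequence identifiers as stated in the
# description. The dictionary layout is an illustrative assumption and
# is not part of the sequence listing itself.
SEQ_IDS = {
    ("mouse", "Tespec PRO-1"): {"protein": 2, "cdna": 1},
    ("mouse", "Tespec PRO-2"): {"protein": 4, "cdna": 3},
    ("human", "Tespec PRO-2"): {"protein": 6, "cdna": 5},
    ("mouse", "Tespec PRO-3"): {"protein": 8, "cdna": 7},
    ("human", "Tespec PRO-3"): {"protein": 10, "cdna": 9},
}


def cdna_seq_id_for(protein_seq_id):
    """Return the cDNA SEQ ID NO paired with a protein SEQ ID NO."""
    for ids in SEQ_IDS.values():
        if ids["protein"] == protein_seq_id:
            return ids["cdna"]
    raise KeyError(f"no protein with SEQ ID NO: {protein_seq_id}")
```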

A high level of expression of the proteins of the present
invention, "Tespec PRO-1" and "Tespec PRO-2", was observed in the
mouse testis (Examples 5 and 6). If these proteins are localized
in the sperm, particularly in the acrosome region, they may function
as key proteases for sperm to achieve fertilization by digesting the
zona pellucida. Thus, the proteins of the present invention may be
useful for developing new therapeutic and diagnostic agents for
sterility or for developing new contraceptives.

The present invention also encompasses proteins that are
functionally equivalent to mouse "Tespec PRO-1", mouse "Tespec
PRO-2", human "Tespec PRO-2", mouse "Tespec PRO-3", or human "Tespec
PRO-3" protein. As used herein, the term "functionally equivalent"
refers to the retention of biological properties equivalent to mouse
"Tespec PRO-1", mouse "Tespec PRO-2", human "Tespec PRO-2", mouse
"Tespec PRO-3", or human "Tespec PRO-3" protein. Illustrative
biological properties include, but are not limited to, for example,
(i) trypsin-family serine protease activity, as the activity
property; (ii) the trypsin-family serine protease motifs
("Trypsin-His" (PROSITE PS00134) and "Trypsin-Ser" (PROSITE
PS00135)) and/or sequences similar thereto, as well as significant
homology to the amino acid sequence of mouse "Tespec PRO-1" protein,
mouse "Tespec PRO-2" protein, human "Tespec PRO-2" protein, mouse
"Tespec PRO-3" protein, or human "Tespec PRO-3" protein (infra), as
the structural properties of the sequence; and (iii) expression in
the testis, as the expression property.
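
The structural property (ii) can be checked computationally by
scanning a deduced amino acid sequence for the two motifs. The
following is a minimal sketch in Python. The regular expressions are
a transcription of the PROSITE patterns PS00134 and PS00135 as
commonly published and should be verified against the current PROSITE
release; the function name find_motifs and the toy sequence are
hypothetical and are not taken from the sequence listing.

```python
import re

# Assumed transcription of the PROSITE patterns into regular expressions.
MOTIFS = {
    # PS00134: [LIVM]-[ST]-A-[STAG]-H-C
    "Trypsin-His": re.compile(r"[LIVM][ST]A[STAG]HC"),
    # PS00135 (12 residues):
    # [DNSTAGC]-[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-[LIVMFYWH]-[LIVMFYSTANQH]
    "Trypsin-Ser": re.compile(
        r"[DNSTAGC][GSTAPIMVQH].{2}G[DE]SG[GS][SAPHV][LIVMFYWH][LIVMFYSTANQH]"
    ),
}


def find_motifs(seq):
    """Return {motif name: [(1-based start, matched residues), ...]}."""
    seq = seq.upper()
    return {
        name: [(m.start() + 1, m.group()) for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical toy sequence containing one instance of each motif.
TOY_SEQ = "MKTIVLTAAHCYKSGIQDSCQGDSGGPVVCNGE"
hits = find_motifs(TOY_SEQ)
```

An exact pattern match of this kind would not report a site that
departs from the canonical motif at a few residues, so a sequence with
such a variant site needs a mismatch-tolerant comparison or manual
inspection.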

Methods for introducing mutations into the amino acid sequence
of a protein, for example, may be used to obtain such functionally
equivalent proteins. To obtain a protein with mutations introduced
into its amino acid sequence, methods such as site-specific
mutagenesis using synthetic oligonucleotide primers (Kramer, W. and
Fritz, H.J. Methods in Enzymol., (1987) 154: 350-367), a PCR system
for site-specific mutagenesis (GIBCO-BRL), and Kunkel's method
(Methods Enzymol., (1988) 85: 2763-2766) may be used. By these
methods, a protein comprising the amino acid sequence of SEQ ID NO:
2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ ID NO: 10 can be
modified to obtain a protein in which one or more amino acids in its
amino acid sequence have been deleted, added, inserted and/or
substituted with different amino acids without affecting the
biological properties of the protein.

There is no particular limitation on the number of amino acids
that may be mutated, as long as the protein retains the biological
properties of the wild-type protein (comprising the amino acid
sequence of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8
or SEQ ID NO: 10). Such mutations include, but are not limited to,
for example:

- deletion of one or more amino acids, preferably 2 to 30, and more
preferably 2 to 10 amino acids, from any one of the amino acid
sequences of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO:
8, and SEQ ID NO: 10;

- addition of one or more amino acids, preferably 2 to 30, and more
preferably 2 to 10 amino acids, to any one of the amino acid
sequences of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO:
8, and SEQ ID NO: 10; and

- substitution of one or more, preferably 2 to 30, and more
preferably 2 to 10 amino acids in any one of the amino acid
sequences of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO:
8, and SEQ ID NO: 10, with different amino acids.

There is also no particular limitation on the amino acid sites
for mutagenesis, so long as the protein retains the biological
properties of the wild-type protein comprising any one of the amino
acid sequences shown in SEQ ID NOs: 2, 4, 6, 8 and 10.

It is known that a protein comprising a modified amino acid
sequence of another protein wherein one or more amino acid residues
have been deleted, added, and/or substituted with different amino
acids can maintain its biological activity (Mark, D.F. et al., Proc.
Natl. Acad. Sci. USA, (1984) 81: 5662-5666; Zoller, M.J. & Smith,
M., Nucleic Acids Research, (1982) 10: 6487-6500; Wang, A. et al.,
Science, 224: 1431-1433; Dalbadie-McFarland, G. et al., Proc. Natl.
Acad. Sci. USA, (1982) 79: 6409-6413).

For example, proteins in which one or more amino acid residues
have been added to proteins of the present invention include fusion
proteins. A fusion protein is a protein made by fusing the protein
of the present invention with another peptide. A fusion protein can
be prepared in an artificial manner. For example, the DNA encoding
the protein of the present invention can be ligated in-frame with
a DNA encoding another peptide, and then introduced into an
expression vector to express the fusion gene in a host using
conventional methods. There is no particular restriction on the
other peptides or proteins to be used for fusion with the protein
of the present invention. Such peptides include, but are not limited
to, for example, FLAG (Hopp, T.P. et al., BioTechnology, (1988) 6:
1204-1210), 6x His consisting of six histidine (His) residues, 10x
His, influenza virus hemagglutinin (HA), human c-myc fragments,
VSV-GP fragments, p18HIV fragments, T7-tag, HSV-tag, E-tag, SV40T
antigen fragments, lck tag, α-tubulin fragments, B-tag, Protein C
fragment, and other well-known peptides. Such proteins include, for
example, GST (glutathione-S-transferase), HA (influenza virus
hemagglutinin), immunoglobulin constant regions, β-galactosidase,
MBP (maltose-binding protein), etc. Commercially available DNAs
encoding these peptides or proteins may also be used to prepare fusion
proteins.
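
The requirement that the two DNAs be ligated in-frame can be checked
on the sequence level before a construct is built. The following is a
minimal sketch in Python, assuming the tag is placed at the N-terminus
and that both coding sequences are supplied without introns; the
function names and both toy sequences are hypothetical and do not
correspond to any sequence of the present invention.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna):
    """Split a DNA string into consecutive triplets, dropping any remainder."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def fuse_in_frame(tag_cds, protein_cds):
    """Join a tag coding sequence to a protein coding sequence.

    Raises ValueError if either part would shift the reading frame or if
    a stop codon occurs before the last codon of the fused ORF.
    """
    tag_cds, protein_cds = tag_cds.upper(), protein_cds.upper()
    if len(tag_cds) % 3 != 0:
        raise ValueError("tag length is not a multiple of 3 (frame shift)")
    if len(protein_cds) % 3 != 0:
        raise ValueError("protein CDS length is not a multiple of 3")
    fused = tag_cds + protein_cds
    if any(c in STOP_CODONS for c in codons(fused)[:-1]):
        raise ValueError("premature stop codon in the fused ORF")
    return fused


# Hypothetical toy inputs: a start codon plus six histidine codons,
# followed by a made-up ORF (Ala-Gly-Ser-Lys-stop).
HIS6_TAG = "ATG" + "CAT" * 6
TOY_ORF = "GCTGGATCCAAATAA"
fused = fuse_in_frame(HIS6_TAG, TOY_ORF)
```

A check of this kind only confirms the reading frame; whether the
fused protein retains protease activity must still be determined
experimentally.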

Using well-known hybridization techniques (Sambrook, J. et al.,
Molecular Cloning 2nd ed., 9.47-9.58, Cold Spring Harbor Lab. Press,
1989) and the DNA encoding the proteins of the present invention (DNA
sequences of SEQ ID NOs: 1, 3, 5, 7 and 9) or a part thereof, one
skilled in the art can isolate DNA homologous to the original DNA.
Using the DNA thus obtained, one skilled in the art can routinely
obtain a protein functionally equivalent to the protein of the
present invention. The present invention includes proteins that are
functionally equivalent to the proteins of the present invention,
including those which are encoded by DNA capable of hybridizing,
under stringent conditions, to the DNA encoding any of the
aforementioned proteins of the present invention, or a part thereof.
In the isolation of such hybridizable DNA from other organisms,
there is no limitation on the type of organism; such organisms
include, but are not limited to, for example, human, mouse, rat,
cattle, monkey, pig, etc. In the context of the present invention,
the term "stringent conditions" typically refers to "42°C, 2x SSC,
0.1% SDS" and the like, preferably "50°C, 2x SSC, 0.1% SDS" and the
like, and more preferably "65°C, 2x SSC, 0.1% SDS" and the like.
Under these conditions, the higher the temperature is set, the
higher the likelihood that DNA with higher homology will be obtained.
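
As a worked example of the buffer composition named in these
conditions, the following minimal sketch in Python computes the
volumes needed to prepare "2x SSC, 0.1% SDS" by dilution. The stock
concentrations (20x SSC and 10% SDS) and the function name are
assumptions made for illustration and are not specified in this
description.

```python
# Temperatures of the three stringency levels given in the description.
STRINGENCY_TEMPS_C = {"typical": 42, "preferred": 50, "more preferred": 65}


def wash_buffer_volumes(final_ml, ssc_x=2.0, sds_pct=0.1,
                        ssc_stock_x=20.0, sds_stock_pct=10.0):
    """Return stock and water volumes (mL) for an SSC/SDS buffer.

    Uses simple dilution (C1 * V1 = C2 * V2). The default stocks,
    20x SSC and 10% SDS, are assumed values.
    """
    ssc_ml = final_ml * ssc_x / ssc_stock_x
    sds_ml = final_ml * sds_pct / sds_stock_pct
    water_ml = final_ml - ssc_ml - sds_ml
    if water_ml < 0:
        raise ValueError("stocks are too dilute for the requested buffer")
    return {"ssc_stock_ml": ssc_ml, "sds_stock_ml": sds_ml,
            "water_ml": water_ml}


# One litre of 2x SSC, 0.1% SDS from the assumed stocks.
volumes = wash_buffer_volumes(1000)
```

Under these assumptions, one litre of buffer takes about 100 mL of
20x SSC, 10 mL of 10% SDS and 890 mL of water; the same buffer is then
used at 42°C, 50°C or 65°C depending on the stringency desired.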

Proteins encoded by DNA isolated by the above hybridization
techniques normally have high homology to the amino acid sequence
of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, or SEQ
ID NO: 10. In the context of the present invention, the term "high
homology" typically refers to at least 60% homology, preferably at
least 70% homology, more preferably at least 80% homology, even more
preferably at least 95%. The degree of homology between two proteins
can be determined using the algorithm described in Wilbur, W.J. and
Lipman, D.J. Proc. Natl. Acad. Sci. USA, (1983) 80: 726-730.
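
To illustrate how a homology value is compared with these thresholds,
the following minimal sketch in Python computes percent identity over
two sequences that have already been aligned (gaps written as "-") and
reports the strictest threshold met. It is a simplified identity
count and is not an implementation of the Wilbur and Lipman algorithm;
the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with identical, non-gap residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100 * matches / len(columns)


# Thresholds from the description, strictest first.
TIERS = [(95.0, "at least 95%"), (80.0, "at least 80%"),
         (70.0, "at least 70%"), (60.0, "at least 60%")]


def homology_tier(pct):
    """Return the strictest threshold met, or None if below 60%."""
    for threshold, label in TIERS:
        if pct >= threshold:
            return label
    return None


# Hypothetical pre-aligned toy peptides differing at one of ten positions.
pct = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
```

For instance, the 69.8% amino acid homology reported between human and
mouse "Tespec PRO-2" would fall in the "at least 60%" tier under this
scheme.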

The proteins of the present invention may differ in amino acid 
sequence, molecular weight, isoelectric point, presence or absence 
of a sugar chain, and form, according to the cells or hosts producing 
the proteins, or to the purification methods. However, as long as 
the obtained proteins retain the biological properties of the 
proteins comprising the amino acid sequence of SEQ ID NO: 2, SEQ ID 
NO: 4, SEQ ID NO: 6, SEQ ID NO: 8 or SEQ ID NO: 10, they are included
in the present invention. 

The protein of the present invention can be a naturally 
occurring protein or can be produced as a recombinant protein, 
utilizing a genetic recombination technique. A naturally occurring 
protein can be prepared, for example, by extracting proteins from 
tissue or cells (for example, testis) in which the proteins of the 
present invention are thought to be present, and then by performing 
affinity chromatography using the antibodies of the present 
invention described below. 

Likewise, for example, to produce a recombinant protein, DNA
encoding the protein of the present invention is incorporated into
an expression vector in a manner such that the DNA is expressed under
the control of expression regulatory regions, such as enhancers and
promoters, and then introduced into host cells to express the
protein.

Specifically, when mammalian cells are used, a conventional,
useful promoter/enhancer, the DNA encoding a protein of the present
invention, and a poly A signal downstream of the 3' end of the
coding region are functionally linked, or a vector containing such
DNA is constructed. Exemplary promoters/enhancers include, but are
not limited to, the human cytomegalovirus immediate early
promoter/enhancer.

Other promoters/enhancers that can be used for protein
expression include, but are not limited to, retroviral, polyomaviral,
adenoviral and simian virus 40 (SV40) promoters/enhancers, and
promoters/enhancers derived from mammalian cells, such as that of
human elongation factor 1α (HEF1α).

Gene expression is easily carried out, for example, according to
the method of Mulligan et al. (Nature (1979) 277: 108) when the SV40
promoter/enhancer is used, and according to the method of Mizushima
et al. (Nucleic Acids Res. (1990) 18: 5322) when the HEF1α
promoter/enhancer is used.

For a replication origin, those derived from SV40, polyomavirus,
adenovirus, bovine papillomavirus (BPV), and the like may be used.
To increase the copy number of the gene in the host cell, the
expression vector may optionally contain a selectable marker, such
as an aminoglycoside phosphotransferase (APH), thymidine kinase (TK),
E. coli xanthine-guanine phosphoribosyl transferase (Ecogpt), or
dihydrofolate reductase (dhfr) gene, etc.

When using E. coli, a conventional useful promoter, a signal
sequence for polypeptide secretion, and the gene to be expressed may
be functionally linked to express the gene. Such promoters include,
but are not limited to, for example, the lacZ and araB promoters.
When the lacZ promoter is used, the method of Ward et al. (Nature
(1989) 341: 544-546; FASEB J. (1992) 6: 2422-2427) can be used. When
the araB promoter is used, the method of Better et al. (Science
(1988) 240: 1041-1043) may be followed.

To produce the protein in the periplasm of E. coli, the pelB
signal sequence (Lei, S.P. et al., J. Bacteriol., (1987) 169: 4379)
may be used as a signal for secretion of the protein.

Any expression vector can be used to produce the protein of the
present invention so long as it is suitable for use with the present
invention. Such expression vectors include, but are not limited to,
for example, the adenoviral vector "pAdexLcw" and the retroviral
vector "pZIPneo". Also included are expression vectors derived from
mammals, including, but not limited to, for example, pEF and
pCDM8; derived from insects, including, but not limited to, for
example, pBacPAK8; derived from plants, including, but not limited
to, for example, pMH1 and pMH2; derived from animal viruses,
including, but not limited to, for example, pHSV, pMV, and pAdexLcw;
derived from retroviruses, including, but not limited to, for example,
pZIPneo; derived from yeast, including, but not limited to, for
example, pNV11 and SP-Q01; derived from Bacillus subtilis, including,
but not limited to, for example, pPL608 and pKTH50; and derived from
E. coli, including, but not limited to, for example, pQE, pGEAPP,
pGEMEAPP, pMALp2 and pREP4.

In the present invention, any production system may be used
to produce the protein. Such production systems include in vitro
and in vivo production systems. Production systems using eukaryotic
cells or prokaryotic cells may be used as in vitro production
systems.

Among the production systems using eukaryotic cells are those
using animal cells, plant cells, and fungal cells. Such animal cells
include mammalian cells, such as CHO (J. Exp. Med. (1995) 108: 945),
COS, myeloma, BHK (baby hamster kidney), HeLa, and Vero; amphibian
cells, such as Xenopus oocytes (Valle, et al., Nature, (1981) 291:
358-340); and insect cells, such as sf9, sf21 and Tn5. Particularly
preferred CHO cells are dhfr-CHO, a DHFR-deficient CHO cell (Proc.
Natl. Acad. Sci. USA, (1980) 77: 4216-4220), and CHO K-1 (Proc. Natl.
Acad. Sci. USA, (1968) 60: 1275).

Nicotiana tabacum-derived cells are plant cells that are well
known for such use. They can be grown as callus cultures. Known
fungal cells include yeasts, such as the genus Saccharomyces, for
example, Saccharomyces cerevisiae, and filamentous fungi, such as
the genus Aspergillus, for example, Aspergillus niger.

Among the production systems using prokaryotic cells is a
production system using bacterial cells. Such bacterial cells
include E. coli and Bacillus subtilis.

These cells are transformed with the DNA of interest, and the
transformed cells are then cultured in vitro to obtain the proteins.
The culture is performed according to conventional methods. For
eukaryotic cells, culture media such as DMEM, MEM, RPMI1640, and
IMDM can be used. These media may be used with a serum supplement,
such as fetal calf serum (FCS), or used as a serum-free medium.
Preferably, the pH of the culture ranges from about 6 to about 8. The
culture is usually conducted for about 15 to 200 hours at a
temperature of about 30°C to 40°C, and, if necessary, the medium may
be changed, aerated, and stirred.

On the other hand, in vivo production systems include systems 
using animals and plants. The DNA of interest is introduced into 
such a plant or animal, within which the protein is produced, and 
then the protein produced is recovered. As used herein, the term 
"host" encompasses such animals and plants as well. 

The systems using animals include the production systems using
mammals and insects. Such mammals include, but are not limited to,
goats, pigs, sheep, mice, and cattle (Vicki Glaser, SPECTRUM
Biotechnology Applications, 1993). When mammals are used,
transgenic animals may be used. For example, the DNA of interest
is inserted within a gene encoding a protein produced intrinsically
in milk, such as goat β casein, to prepare a fusion gene. The DNA
fragment containing the fusion gene in which the DNA of interest is
inserted is injected into a goat embryo, which is then introduced
into a female goat. The protein is then collected from the milk
produced by the transgenic goat born from the goat that received
the embryo, or by descendants thereof. To increase the amount of
milk containing the protein that is produced by the transgenic
goat, suitable hormone(s) may be given to the transgenic goats
(Ebert, K.M. et al., Bio/Technology, (1994) 12: 699-702).

Silkworms are useful insects in the context of the present
invention. When a silkworm is used, it is infected with a baculovirus
into which the DNA of interest has been inserted, and the desired
protein is obtained from the body fluids of the silkworm (Susumu,
M. et al., Nature, (1985) 315: 592-594).

When a plant is used, tobacco, for example, can be used. When
a tobacco plant is used, the DNA of interest is inserted into a plant
expression vector, for example pMON 530, which is then introduced
into a bacterium such as Agrobacterium tumefaciens. This bacterium
is used to infect the tobacco plant, for example Nicotiana tabacum,
to obtain the desired polypeptide from its leaves (Julian K.-C. Ma
et al., Eur. J. Immunol., (1994) 24: 131-138).

The protein of the present invention thus obtained can be
isolated from inside or outside of the cells, or from hosts, and
purified as a substantially pure and homogeneous protein. The
separation and purification of the protein is not limited to any
particular method, and can be done using conventional methods for
separation and purification. For example, chromatography columns,
filtration, ultrafiltration, salting out, solvent precipitation,
solvent extraction, distillation, immunoprecipitation,
SDS-polyacrylamide gel electrophoresis, isoelectric focusing,
dialysis, recrystallization and the like may be suitably selected or
combined to separate/purify the protein.

Such chromatographies include, but are not limited to, for
example, affinity chromatography, ion exchange chromatography,
hydrophobic chromatography, gel filtration, reversed-phase
chromatography, adsorption chromatography, etc. (Strategies for
Protein Purification and Characterization: A Laboratory Course
Manual. Ed Daniel R. Marshak et al., Cold Spring Harbor Laboratory
Press, 1996). These chromatographies can be done by liquid
chromatography, such as HPLC, FPLC, etc. The present invention
encompasses the proteins highly purified by these purification
methods.

Optionally, by treating with an appropriate modification
enzyme before or after the proteins are purified, the proteins can
be modified or their peptides can be partially removed. Such
modification enzymes include, but are not limited to, trypsin,
chymotrypsin, lysyl endopeptidase, protein kinase, and glucosidase.

The present invention also comprises partial peptides of the
proteins of the present invention. Such peptides can be utilized,
for example, as immunogens to give antibodies capable of binding to
the proteins of the present invention. For this purpose, such
peptides will contain at least 12 amino acid residues, and preferably
at least 20 amino acid residues. Partial peptides of the proteins
of the present invention may be produced by genetic engineering
techniques, by well-known methods for synthesizing peptides, or by
cleaving the protein of the present invention with a suitable
peptidase. To synthesize peptides, either solid-phase synthesis or
liquid-phase synthesis may be used.

A protein of the present invention or a partial peptide thereof
that is expressed in a host by using a genetic engineering technique
can be isolated from the cells or extracellular materials and can
be purified as a substantially pure and homogeneous protein. There
is no limitation on the methods of isolation and purification of the
protein; any of the generally used methods for protein purification
may be used to isolate and purify the protein. Separation and
purification of the protein can be achieved by properly selecting
or combining methods including, but not limited to, for example,
column chromatography, filtration, ultrafiltration, salting out,
solvent precipitation, solvent extraction, distillation,
immunoprecipitation, SDS-polyacrylamide gel electrophoresis,
isoelectric focusing, dialysis, and recrystallization.

Such chromatographies include, but are not limited to, for example,
affinity chromatography, ion exchange chromatography, hydrophobic
chromatography, gel filtration, reversed-phase chromatography,
adsorption chromatography, etc. (Strategies for Protein
Purification and Characterization: A Laboratory Course Manual. Ed.
Daniel R. Marshak et al., Cold Spring Harbor Laboratory Press, 1996).
These chromatographies can be done by liquid chromatography, such
as HPLC, FPLC, etc. The present invention encompasses the proteins
highly purified by these purification methods.

Optionally, by treating with an appropriate modification
enzyme before or after the proteins are purified, the proteins can
be modified or their peptides can be partially removed. Such
modification enzymes include trypsin, chymotrypsin, lysyl
endopeptidase, protein kinase, and glucosidase.

Further, the present invention provides for DNA encoding the
proteins of the present invention mentioned above. The DNA of the
present invention can be used not only to produce the proteins of
the present invention in vivo and in vitro, but also for gene therapy
of, for example, mammals (e.g., humans). It is expected that the genes
of the present invention, in particular, may be applied to the gene
therapy of infertility. When used in gene therapy, the DNA of
the present invention is inserted into a vector and then administered
to the target sites in the body. The method of administration may
be ex vivo or in vivo. The vectors of the present invention include
such vectors as are used for gene therapy.

Genomic DNA or cDNA that encodes the protein of the present 
invention may be obtained by screening a genomic library, a cDNA 
library or the like, using a hybridization technique well known to 
one skilled in the art.

By using the obtained DNA or cDNA fragment as a probe, and 
further by screening genomic or cDNA libraries, the genes can be 
obtained from other cells, tissues, organs, or species. Genomic and 
cDNA libraries may be prepared by, for example, the method of Sambrook,
J. et al., Molecular Cloning, Cold Spring Harbor Laboratory Press
(1989). Also, commercially available DNA libraries may be used.

By determining the nucleotide sequence of the obtained cDNA, 
the translatable region encoded by the cDNA can be identified to 
obtain the amino acid sequence of the protein of the present 
invention.

Specifically, this can be done as follows. First, mRNA is
isolated from cells, tissue, or an organ expressing a protein of the
present invention. To isolate mRNA, a well-known method, for example,
guanidine ultracentrifugation (Chirgwin, J.M. et al., Biochemistry,
(1979) 18: 5294-5299) or the AGPC method (Chomczynski, P. and Sacchi,
N., Anal. Biochem., (1987) 162: 156-159), is used to isolate total
RNA, from which mRNA is purified using mRNA Purification Kit
(Pharmacia), etc. Alternatively, QuickPrep mRNA Purification Kit
(Pharmacia) can be used to prepare mRNA directly.

cDNA is synthesized from the obtained mRNA by reverse
transcriptase. It can be synthesized using the AMV Reverse
Transcriptase First-strand cDNA Synthesis Kit (SEIKAGAKU KOGYO), etc.
Also, it may be synthesized and amplified with the probes set forth
herein, according to the 5'-RACE method (Frohman, M.A. et al., Proc.
Natl. Acad. Sci. USA, (1988) 85: 8998-9002; Belyavsky, A. et al.,
Nucleic Acids Res., (1989) 17: 2919-2932) using the 5'-Ampli FINDER
RACE KIT (Clontech) and the polymerase chain reaction (PCR).

The DNA fragment of interest is prepared from the PCR product
obtained and ligated with vector DNA. Recombinant vectors are thus
created, and they are introduced into host cells, such as E. coli.
Colonies are selected to prepare the desired recombinant vector. The
nucleotide sequence of the DNA of interest may be verified by a known
method, for example, the dideoxynucleotide chain termination
method.
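
Once the nucleotide sequence of a cDNA has been determined, the
translatable region mentioned above might be located along the
following lines. This is only a minimal sketch: the function name
longest_orf and the toy sequence in the usage note are hypothetical,
only the forward strand is scanned, and the reported span includes the
stop codon.

```python
STOP_CODONS = {"TAA", "TGA", "TAG"}


def longest_orf(cdna):
    """Return the longest ATG-to-stop open reading frame on the forward strand.

    The result is a (first, last) tuple of 1-based, inclusive nucleotide
    positions, with the stop codon included, or None if no ORF is found.
    """
    seq = cdna.upper()
    best = None
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                length = i + 3 - start
                if best is None or length > best[1] - best[0] + 1:
                    best = (start + 1, i + 3)
                start = None
    return best
```

For the toy sequence "CCATGGCTTAAGG", the function reports the span
from nucleotide 3 to nucleotide 11 (ATG GCT TAA). A real analysis
would also examine the complementary strand and compare the deduced
amino acid sequence with known proteins.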

The DNA of the present invention can be designed to have a
sequence with higher expression efficiency, taking into account the
codon usage of the host used for the expression (Grantham, R. et al.,
Nucleic Acids Research, (1981) 9: r43-r74). Also, the DNA of the
present invention may be modified using commercially available kits
or well-known methods. Such modification(s) include, but are not
limited to, for example, digestion with restriction enzymes,
insertion of synthetic oligonucleotides or suitable DNA fragments,
addition of linkers, and insertion of a start codon (ATG) and/or stop
codon (TAA, TGA, or TAG).

The DNA of the present invention encompasses, for example, the 
DNA comprising the nucleotide sequence extending from A at nucleotide 
48 to C at nucleotide 1010 of the nucleotide sequence set forth in 
SEQ ID NO: 1; the DNA comprising the nucleotide sequence extending 
from A at nucleotide 69 to C at nucleotide 1025 of the nucleotide 
sequence set forth in SEQ ID NO: 3; the DNA comprising the nucleotide 
sequence extending from A at nucleotide 73 to A at nucleotide 867 
of the nucleotide sequence set forth in SEQ ID NO: 5; the DNA 
comprising the nucleotide sequence extending from A at nucleotide 
38 to A at nucleotide 1000 of the nucleotide sequence set forth in 
SEQ ID NO: 7; and the DNA comprising the nucleotide sequence extending 
from A at nucleotide 41 to C at nucleotide 1096 of the nucleotide 
sequence set forth in SEQ ID NO: 9. 
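
As a simple arithmetic check on the nucleotide ranges listed above,
the following sketch computes the length of each span and the number
of whole codons it contains. The coordinates are those given in the
text; the names CODING_REGIONS and region_summary are hypothetical,
and whether each span includes a stop codon is not assumed here.

```python
# SEQ ID NO -> (first nucleotide, last nucleotide), 1-based and inclusive,
# as listed in the paragraph above.
CODING_REGIONS = {
    1: (48, 1010),
    3: (69, 1025),
    5: (73, 867),
    7: (38, 1000),
    9: (41, 1096),
}


def region_summary(first, last):
    """Return (length in nucleotides, whole codons, leftover nucleotides)."""
    length = last - first + 1
    codons, remainder = divmod(length, 3)
    return length, codons, remainder


if __name__ == "__main__":
    for seq_id, (first, last) in CODING_REGIONS.items():
        length, codons, remainder = region_summary(first, last)
        print(f"SEQ ID NO: {seq_id}: {length} nt, {codons} codons, "
              f"{remainder} nt left over")
```

Under this arithmetic the five spans are 963, 957, 795, 963, and 1056
nucleotides long, respectively, and each is an exact multiple of
three, as expected for a protein-coding region.
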

The DNA of the present invention further encompasses DNA that
hybridizes under stringent conditions to the DNA of any of the
nucleotide sequences of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5,
SEQ ID NO: 7, and SEQ ID NO: 9, so long as the hybridizing DNA also
encodes a protein functionally equivalent to the protein of the
present invention.

The "stringent conditions" are typically "42°C, 2x SSC, 0.1%
SDS" and the like, preferably "50°C, 2x SSC, 0.1% SDS" and the like,
and more preferably "65°C, 2x SSC, 0.1% SDS" and the like. Under
these conditions, the higher the temperature is set, the higher the
likelihood that DNA with higher homology will be obtained.

The hybridizable DNA mentioned above may be, for example,
naturally occurring DNA (for example, cDNA and genomic DNA). For 
naturally occurring DNA, organisms used for isolation of DNA encoding 
the functionally equivalent protein include, but are not limited to, 
for example, human, mouse, rat, cattle, monkey, pig, etc. For example, 
in such animals, in a working example described herein, the DNA of 
the present invention was isolated using cDNA derived from a tissue 
(for example, testis) in which mRNA capable of hybridizing to cDNA 
encoding the protein of the present invention was detected. DNA 
encoding the proteins of the present invention may be cDNA or genomic 
DNA, as well as synthetic DNA. 

The present invention also provides for a method of screening 
for substrates of the proteins of the present invention. In the 
context of the present invention, the term "substrate" of the 
proteins of the present invention refers to a compound that is 
decomposed or cleaved at a specific site upon the binding of a protein 
of the present invention. 

The compounds to be used as substrates are not restricted to
proteins. For example, trypsin and chymotrypsin are known to cleave
not only proteins but also amide and ester bonds in the derivatives
of peptidic compounds (Farmer, D.A. et al., J. Biol. Chem., (1975)
250: 7366-7371; del Castillo, L.M. et al., Biochim. Biophys. Acta,
(1971) 235: 358-69). Thus, in the present invention, there is no
limitation on the types of substrates so long as they are decomposed
or cleaved at a specific site upon the binding of a protein of the
present invention. Such substrates may be peptides, analogues or
derivatives (peptidic compounds) thereof, or non-peptidic
compounds.

The method of screening for the substrates of the present
invention comprises the steps of: (a) contacting a test sample with
any of the proteins of the present invention, (b) detecting the
protease activity of the protein of the present invention against
the test sample, and (c) selecting a compound that is decomposed or
cleaved by the protease activity of the protein of the present
invention.

Test samples used for screening are those expected to contain 
the substrates for the protein of the present invention, including, 
but not limited to, for example, cell extracts, extracts from animal 
tissues, expressed products of a gene library, purified or crude 
proteins, peptides, peptidic analogues or derivatives, non-peptidic 
compounds, synthetic compounds, and naturally occurring compounds. 

In the screening of the substrates capable of binding to the
proteins of the present invention, for example, a test sample is mixed
with a protein of the present invention, and the mixture is incubated.
Subsequently, a change within the test sample (cleavage or
decomposition) is assayed. For example, when the test sample is a
protein, the test sample can be assayed directly, or after being
azidated or bound to a fluorescent substance, to detect changes in its
UV spectrum (Beynon, R. J. and Bond, J. S., Proteolytic enzymes (1989)
IRL Press, pp. 25-55) or HPLC profile (Maier, M. et al., FEBS Lett.,
(1988) 232: 395-398; Gau, W. et al., Adv. Exp. Med. Biol. (1983) 156:
483-494) before and after the reaction, thereby measuring the protease
activity.

When the test sample is a peptide (or an analogue or derivative
thereof), such peptide (or an analogue or derivative thereof)
consisting of several amino acids (often, but not limited to, one
to five amino acid residues) is mixed with a protein of the present
invention, and incubated. Subsequently, changes within the test
sample are assayed. For example, the test sample may be labeled with
a fluorescent compound (MEC: Kawabata, S. et al. (1988) Eur. J.
Biochem., 172: 17-25; AMC: Morita, T. et al. (1977) J. Biochem.
(Tokyo), 82: 1495-1498; AFC: Garrett, J.R. et al. (1985) Histochem.
J., 17: 805-817, etc.) at the carboxyl terminus. The protease
activity may then be assayed using, as an index, the spectral changes
of the fluorescent compound upon cleavage of the test sample. Screening
methods utilizing other fluorescently labeled peptide substrates can
also be used (Beynon, R. J. and Bond, J. S., Proteolytic enzymes (1989)
IRL Press, pp. 25-55; Gossrau, R. et al. (1984) Adv. Exp. Med. Biol.,
167: 191-207; and Yu, J.X. et al., J. Biol. Chem., (1994) 269:
18843-18848).
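
When a fluorescently labeled peptide is used, the protease activity
is commonly expressed as the rate at which fluorescence increases over
time. The following sketch estimates that rate as the least-squares
slope of fluorescence readings against time. The function name
initial_rate and the readings in the usage note are hypothetical, and
the sketch assumes the readings come from the linear, early part of
the reaction.

```python
def initial_rate(times, fluorescence):
    """Least-squares slope of fluorescence versus time.

    `times` and `fluorescence` are equal-length sequences of numbers;
    the result is in fluorescence units per time unit.
    """
    n = len(times)
    if n < 2 or n != len(fluorescence):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times) / n
    mean_f = sum(fluorescence) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (f - mean_f)
              for t, f in zip(times, fluorescence))
    return sxy / sxx
```

For readings of 10, 12, 14, and 16 units taken at 0, 1, 2, and 3
minutes, the estimated rate is 2 units per minute. A candidate
substrate would be one for which the rate with enzyme clearly exceeds
the rate in an enzyme-free control.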

In addition, the principle of the above-mentioned methods can
be applied to the screening by using, as the test compounds, synthetic
compounds, a bank of naturally occurring substances, a lambda phage
peptide display library, pin peptide synthetic compounds, etc. Also,
high-throughput screening is possible by utilizing combinatorial
chemistry techniques (Wrighton, N.C., Farrell, F.X., Chang, R.,
Kashyap, A.K., Barbone, F.P., Mulcahy, L.S., Johnson, D.L., Barrett,
R.W., Jolliffe, L.K., Dower, W.J., "Small peptides as potent mimetics
of the protein hormone erythropoietin", Science (UNITED STATES), Jul
26, 1996, 273, p458-64; Verdine, G.L., "The combinatorial chemistry
of nature", Nature (ENGLAND), Nov 7, 1996, 384: 11-13; Hogan, J.C.
Jr., "Directed combinatorial chemistry", Nature (ENGLAND), Nov 7,
1996, 384: 17-19).

Once substrates for the proteins of the present invention are 
isolated by using the screening method mentioned above, screening 
for inhibitors of the proteins of the present invention may then be 
conducted, the inhibitors being indexed by their inhibitory activity 
against the protease activity of the proteins of the present 
invention to the substrates. Thus the present invention also 
provides for a method of screening for compounds inhibiting the 
activity of the proteins of the present invention. 

This method comprises the steps of: (a) contacting a protein 
of the present invention with its substrate in the presence of a test 
sample, (b) detecting protease activity of the protein of the present 
invention to the substrate, and (c) selecting a compound capable of 
lowering the protease activity relative to that detected in the 
absence of the test sample. 
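
Step (c) amounts to comparing the protease activity measured with
and without each test sample. A minimal sketch of that comparison
follows; the function names, the sample names, and the 50% selection
threshold are hypothetical and are not taken from the specification.

```python
def percent_inhibition(activity_without, activity_with):
    """Percent reduction in protease activity caused by a test sample."""
    if activity_without <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * (1.0 - activity_with / activity_without)


def select_inhibitors(control_activity, sample_activities, threshold=50.0):
    """Return the names of test samples that lower activity by at least
    `threshold` percent relative to the control without test sample.

    `sample_activities` maps a sample name to the activity measured in
    the presence of that sample. The default threshold is an arbitrary
    illustration.
    """
    return [
        name
        for name, activity in sample_activities.items()
        if percent_inhibition(control_activity, activity) >= threshold
    ]
```

For example, with a control activity of 200 units, a sample that
leaves 50 units of activity gives 75% inhibition and would be
selected, whereas a sample that leaves 190 units would not.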


The proteins of the present invention useful for screening 
include authentic proteins, recombinant proteins, and partial 
peptides derived therefrom. Test samples useful for screening 
include, but are not limited to, cell culture supernatant, expression
products of a gene library, peptides, peptide analogues or
derivatives, purified or crude proteins (including antibodies), 
non-peptidic compounds, synthetic compounds, products from 
fermentation of microorganisms, extracts from marine organisms, 
plant extracts, cell extracts, extracts from animal tissues, etc. 

Screening for inhibitors of the proteins of the present
invention can be performed, for example, by using the systems
described in the following references (Beynon, R. J. and Bond, J.
S., Proteolytic enzymes (1989), IRL Press, pp. 25-55; Maier, M. et
al. (1988) FEBS Lett. 232: 395-398; Gau, W. et al. Adv. Exp. Med.
Biol., (1983) 156: 483-494; Kawabata, S. et al. (1988) Eur. J. Biochem.
172: 17-25; Morita, T. et al. (1977) J. Biochem. (Tokyo) 82:
1495-1498; Garrett, J. R. et al. (1985) Histochem. J. 17: 805-817;
Gossrau, R. et al. (1984) Adv. Exp. Med. Biol. 167: 191-207; Yu, J.
X. et al., (1994) J. Biol. Chem., 269: 18843-18848). Further, given
that a peptide substrate is a lead compound, compounds that have
resulted from modification or substitution of a part of the structure
of the lead compound can be used as the test compounds in the screening
for inhibitors of the proteins of the present invention (Okamoto,
S. et al. (1993) Methods Enzymol., 222: 328-340).

As described above, expression patterns and such of the 
proteins of the present invention suggest that the proteins of the 
present invention may be involved in sperm differentiation and 
maturation, or sperm function (fertilization) . Inhibitors that are 
isolated using the screening method of the invention can be utilized 
to analyze the involvement of the proteins of the present invention 
in fertilization. For example, the inhibitors of the proteins of
the present invention may be used for in vitro analysis of
fertilization (Y. Toyoda et al., 1971, Jpn. J. Anim. Reprod., 16:
147-151; Y. Kuribayashi et al., 1996, Fertil. Steril., 66: 1012-1017),
which can subsequently be used to determine whether the inhibitors
are capable of inhibiting fertilization or not. Such an inhibitor
of a protein of the present invention that is capable of inhibiting
fertilization finds potential utility as, for example, a new
contraceptive.

The compounds obtained by the screening method of the present
invention may find practical utility as drugs for treating humans
and other mammals, such as mice, rats, guinea pigs, rabbits, chickens,
cats, dogs, sheep, pigs, cattle, monkeys, sacred baboons, and
chimpanzees, according to conventional means.

For example, the drugs can be administered orally, in the form
of tablets coated with sugar, if necessary, capsules, elixirs or
microcapsules, or they can be administered parenterally, in the form
of injections of sterile solutions or suspensions in water or other
pharmaceutically acceptable liquids. For example, a compound having
the activity to bind to a protein of the present invention can be
mixed with a physiologically acceptable carrier, flavoring agent,
excipient, vehicle, preservative, stabilizer, and/or bonding agent
in the form of a unit dose that is required for pharmaceutical
implementations accepted in general. These active ingredients
enable the preparations to be obtained in a suitable volume within
the indicated volume range.

Examples of additives that can be mixed into tablets and
capsules include, but are not limited to, binders, such as gelatin,
corn starch, tragacanth gum, and gum arabic; excipients, such as
crystalline cellulose; swelling agents, such as corn starch, gelatin,
and alginic acid; lubricants such as magnesium stearate; sweeteners
such as sucrose, lactose, and saccharin; and flavoring agents such
as peppermint, Gaultheria adenothrix oil, and cherry. When the unit
dosage form is a capsule, a liquid carrier, such as oil, can also
be included in the above additives. Sterile compositions for
injections can be formulated by following standard drug
implementations provided for dissolving or suspending active
substances in such a vehicle as distilled water, or natural vegetable
oils, such as sesame oil and coconut oil.

For example, physiological saline and isotonic liquids
including glucose or other adjuvants, such as D-sorbitol, D-mannose,
D-mannitol, and sodium chloride, can be used as aqueous solutions
for injections. These can be used in conjunction with suitable
solubilizers, including, but not limited to, alcohol, specifically
ethanol, polyalcohols such as propylene glycol and polyethylene
glycol, and non-ionic surfactants, such as Polysorbate 80 (TM) and
HCO-50.

Sesame oil or soybean oil can be used as an oleaginous liquid
and may be used in conjunction with a solubilizer, such as benzyl
benzoate and benzyl alcohol. In addition, such a liquid can be
combined with a buffer, such as phosphate buffer and sodium acetate
buffers; a pain-killer, such as benzalkonium chloride and procaine
hydrochloride; a stabilizer, such as benzyl alcohol and phenol; and
an anti-oxidant. The prepared injection is usually filled into a
suitable ampoule.

Although the doses of the compounds that are obtained by the
screening method of the present invention vary according to the
symptoms, typically, an amount of about 0.1 to about 100 mg per day,
preferably, about 1.0 to about 50 mg per day, and more preferably,
about 1.0 to about 20 mg per day is administered orally to an adult
(body weight 60 kg).

When administered parenterally, doses will differ, depending
on the patient, target organ, symptoms and method of administration.
A daily dose of usually about 0.01 to about 30 mg, preferably about
0.1 to about 20 mg, and more preferably about 0.1 to about 10 mg for
an adult (body weight 60 kg) is advantageously administered by
intravenous injection. For administration to other animals, the
amount is converted on the basis of 60 kg of body weight.
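
One plausible reading of the conversion mentioned above is a linear
scaling by body weight relative to the 60 kg adult. The following
sketch illustrates only that arithmetic; the function name and the
linear-scaling assumption are not taken from the specification.

```python
REFERENCE_BODY_WEIGHT_KG = 60.0  # adult body weight stated in the text


def scale_daily_dose(dose_mg_for_reference_adult, body_weight_kg):
    """Scale a daily dose stated for a 60 kg adult to another body weight.

    Assumes simple proportionality to body weight, which is an
    illustrative assumption rather than a method stated in the text.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (dose_mg_for_reference_adult * body_weight_kg
            / REFERENCE_BODY_WEIGHT_KG)
```

Under this assumption, a daily amount of 30 mg stated for a 60 kg
adult corresponds to 10 mg for an animal weighing 20 kg.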

The present invention further provides antibodies capable of
binding to a protein of the present invention. Such antibodies can
be utilized for detection and purification of the protein of the
present invention, as well as for in vitro analysis of fertilization.
An antibody can be obtained as a monoclonal antibody or a polyclonal
antibody by using a well-known method.

An antibody that specifically binds to a protein of the present
invention can be prepared by using the protein of the present
invention as a sensitizing antigen for immunization, according to
a standard immunizing method, by fusing the immune cells obtained
with any known parent cells, using a conventional method of cell
fusion, and by screening for the cells producing an antibody, using
a standard screening technique.

Specifically, a monoclonal or polyclonal antibody that 
specifically binds to the proteins of the present invention may be 
prepared as follows. 

For example, the protein of the present invention that is used
as a sensitizing antigen for obtaining the antibody is not restricted
by the animal species from which it is derived, but is preferably
a protein derived from mammals, for example, humans, mice, or rats,
especially from humans. Proteins of human origin can be obtained
based on the nucleotide sequence or amino acid sequence disclosed
herein.

A protein to be used as a sensitizing antigen in the present 
invention may be a protein of the present invention or a partial 
peptide thereof. Partial peptides of a protein include, for example, 
amino (N) terminal fragments of the protein, and carboxyl (C) 
terminal fragments. In the context of the present invention, the 
term "antibody" of the present invention refers to an antibody that 
binds to the full-length protein or a fragment thereof. 

A gene encoding a protein of the present invention or a fragment 
thereof is inserted into a well-known expression vector system, and 
the host cells described herein are transformed. Subsequently, the 
protein of interest or a fragment thereof is obtained from the host 
cells or the culture medium, using a well-known method, and used as 
a sensitizing antigen. Also, cells expressing the protein and lysate 
thereof, and a chemically synthesized protein of the present 
invention and a partial peptide thereof may be used as sensitizing 
antigens.

Mammals that can be immunized with the sensitizing antigens
generally include, but are not limited to, Rodentia, Lagomorpha and
Primates. To generate monoclonal antibodies, it is preferable to
select a mammal by considering its compatibility with parent cells
used for cell fusion.

Animals belonging to Rodentia include, but are not limited to,
for example, mice, rats, hamsters, etc. Animals belonging to
Lagomorpha include, but are not limited to, for example, rabbits,
and Primates include, but are not limited to, for example, monkeys.
Among monkeys, monkeys of the infraorder Catarrhini (Old World
monkeys), for example, cynomolgus monkeys, rhesus monkeys, sacred
baboons, and chimpanzees, may be used.

Any of a number of well-known methods may be used to immunize
animals with a sensitizing antigen. For example, the sensitizing
antigen is generally injected into mammals intraperitoneally or
subcutaneously. Specifically, the sensitizing antigen is diluted
or suspended with a buffer, such as physiological saline and
phosphate-buffered saline (PBS), to be prepared in an appropriate
amount, and, if desired, mixed with a suitable amount of a common
adjuvant, such as Freund's complete adjuvant. The antigen thus
prepared may be emulsified and then injected into the mammal.
Thereafter, the sensitizing antigen suitably mixed with Freund's
incomplete adjuvant is preferably administered several times at 4-
to 21-day intervals. A suitable carrier can also be used when an
animal is immunized with the sensitizing antigen. After the
immunization, elevation of the level of the desired antibody in the
serum is confirmed by a conventional method.

To obtain polyclonal antibodies against the proteins of the
invention, blood is removed from the mammal sensitized with the
antigen after the level of the desired antibody in the serum is
confirmed to have increased. Serum may be isolated from the blood by
any well-known method. The serum containing the polyclonal antibody
may be used as the polyclonal antibody, and further, if necessary, the
fraction containing the polyclonal antibody may be isolated from the
serum.

To obtain monoclonal antibodies, after verifying that the level
of the desired antibody has been increased in the serum of the mammal
sensitized with the above-described antigen, immunocytes are taken
out from the mammal and used for cell fusion. In this procedure,
preferable immunocytes for cell fusion are splenocytes in particular.
Parent cells to be fused with the above immunocytes are preferably
mammalian myeloma cells.

Cell fusion of the above immunocytes and myeloma cells may be
routinely carried out using any well-known method, for example, the
method of Milstein et al. (Galfre, G. and Milstein, C., Methods
Enzymol., (1981) 73: 3-46).

Hybridomas obtained from the cell fusion are selected
by culturing them in a usual selective culture medium, for
example, HAT culture medium (a medium containing hypoxanthine,
aminopterin and thymidine). The culture in the HAT medium is
continued for a period sufficient to eliminate the cells (non-fused
cells) other than the hybridomas of interest, usually for a few days
to a few weeks. Subsequently, conventional limiting dilution
analysis is performed to screen for and clone the hybridoma producing
the antibody of interest.

In addition to obtaining the hybridomas mentioned above by
immunizing an animal other than human with the antigen, human
lymphocytes, for example, human lymphocytes infected with EB virus,
can be sensitized in vitro with a protein, protein-expressing cells
or lysates thereof, and the sensitized lymphocytes can then be fused
with myeloma cells derived from human that have the capacity of
permanent cell division, for example U266, to obtain a hybridoma
producing the human antibody of interest that has binding
activity to the protein (Unexamined Published Japanese Patent
Application (JP-A) No. Sho 63-17688).

Moreover, a transgenic animal having a human antibody gene
repertoire is immunized with an antigen, such as a protein,
protein-expressing cells and cell lysate thereof to obtain
antibody-producing cells, which are then fused with myeloma cells
to obtain hybridomas. The hybridomas may be used to obtain a human
antibody against the protein (WO92/03918, WO93/2227, WO94/02602,
WO94/25585, WO96/33735, and WO96/34096).

Instead of producing antibodies from hybridomas,
antibody-producing immunocytes such as sensitized lymphocytes that
are immortalized with an oncogene may be used.

Such monoclonal antibodies, obtained as described above, can
be produced as recombinant antibodies using genetic engineering
techniques (for example, see Borrebaeck, C.A.K. and Larrick, J.W.,
THERAPEUTIC MONOCLONAL ANTIBODIES, Published in the United Kingdom
by MACMILLAN PUBLISHERS LTD, 1990). A recombinant antibody may be
produced as follows: the DNA encoding the antibody is cloned from
a hybridoma or immunocytes, such as sensitized lymphocytes producing
the antibody, and incorporated into a suitable vector, which is then
introduced into a host to produce the antibody. The present invention
encompasses such recombinant antibodies as well.

The antibody of the present invention may be an antibody
fragment or a modified antibody, so long as it binds to a protein
of the present invention. For example, antibody fragments include
Fab, F(ab')2, Fv, or single chain Fv in which the H chain Fv and the
L chain Fv are suitably linked via a linker (scFv, Huston, J.S. et
al., Proc. Natl. Acad. Sci. USA, (1988) 85: 5879-5883). Specifically,
antibody fragments can be produced by treating an antibody with an
enzyme, for example, papain, pepsin, etc. Alternatively, a gene
encoding any of the antibody fragments can be constructed, introduced
into an expression vector, and then expressed in suitable host cells
(for example, see Co, M. S. et al., J. Immunol., (1994) 152:
2968-2976; Better, M. and Horwitz, A. H., Methods Enzymol., (1989)
178: 476-496; Pluckthun, A. and Skerra, A., Methods Enzymol., (1989)
178: 497-515; Lamoyi, E., Methods Enzymol., (1986) 121: 652-663;
Rousseaux, J. et al., Methods Enzymol., (1986) 121: 663-669; Bird,
R. E. and Walker, B. W., Trends Biotechnol., (1991) 9: 132-137).

Any antibodies bound to various molecules, such as polyethylene
glycol (PEG), can be used as modified antibodies. The "antibody"
in the context of the present invention encompasses such modified
antibodies as well. To obtain such a modified antibody, the antibody
obtained may be chemically modified. These methods are well
established in the art.

The antibody of the present invention may be obtained as a
chimeric antibody, comprising a variable region derived from a
non-human antibody and a constant region derived from a human
antibody by using conventional techniques. Alternatively, the
antibody of the present invention may be obtained as a humanized
antibody, comprising a complementarity determining region (CDR)
derived from a non-human antibody, a framework region (FR) derived
from a human antibody, and a constant region.

Antibodies thus obtained can be purified to a homogeneous state.
The antibodies used in the present invention may be separated and
purified by any conventional methods used for separation and
purification of proteins. There is no limitation on such methods.
The concentration of the above-mentioned antibodies can be
determined by measuring absorbance, or by the enzyme-linked
immunosorbent assay (ELISA), etc.

Assays for antigen-binding activity of the antibody of the
present invention include, but are not limited to, ELISA, enzyme
immunoassay (EIA), radioimmunoassay (RIA), and immunofluorescence.
For example, when ELISA is used, a protein of the present invention
is placed in a plate coated with the antibody of the present invention,
and subsequently, a sample containing the antibody of interest, for
example, a culture supernatant of the cells producing the antibody
or a purified antibody, is added to the plate. A secondary antibody
that recognizes the antibody, labeled with an enzyme such as alkaline
phosphatase, is added to the plate, which is then incubated and washed.
Subsequently, an enzyme substrate, such as p-nitrophenyl phosphate,
is added to the plate, and the antigen-binding activity is estimated
by measuring the absorbance. As a protein, a fragment of the protein,
such as a fragment comprising the C-terminal or N-terminal region,
may be used. To evaluate the activity of the antibody of the present
invention, BIAcore (Pharmacia) may be used.

By using these techniques, a method for detecting or
determining the proteins of the present invention can be carried out,
which method comprises the steps of contacting an antibody of the
present invention with a sample presumed to contain a protein of the
present invention and detecting or determining the immune complex
formed between the antibody and the protein. Since the method of
the present invention for detecting or determining proteins can
specifically detect or assay the proteins, it is useful in various
experiments using proteins.

In addition, the present invention also provides nucleotides
specifically hybridizing to the DNA of the nucleotide sequences shown
in SEQ ID NOs: 1, 3, 5, 7 and 9 (or complementary DNA thereof), which
nucleotides have a chain length of at least 15 nucleotides. As used
herein, the term "specifically hybridizing" indicates that
cross-hybridization does not significantly occur with DNA encoding
other proteins under the usual hybridization conditions, preferably
under stringent hybridization conditions. Such nucleotides can be
used as probes for detecting or isolating DNA that encodes a
protein of the present invention, or as primers for amplification.
Taking into account the temperature of the hybridization reaction, the
duration of the reaction, the concentration of the probe or primer, the
length of the probe or primer, the ionic strength, and other factors,
those skilled in the art can properly select the stringency for the
specific hybridization.
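
As a rough illustration of how probe length and base composition enter
into the choice of stringency, the following sketch estimates the
melting temperature of a short oligonucleotide with the Wallace rule
(2°C per A or T, 4°C per G or C). The Wallace rule is a general rule
of thumb for short oligonucleotides and is not a method described in
this specification; the 15-nucleotide probe below is a hypothetical
sequence chosen only for illustration.

```python
def wallace_tm(oligo: str) -> int:
    """Rough melting temperature (deg C) of a short oligonucleotide.

    Wallace rule: Tm = 2 * (A + T) + 4 * (G + C).
    This is only a first approximation; it ignores ionic strength and
    probe concentration, which also affect stringency.
    """
    seq = oligo.upper().replace(" ", "")
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("only non-empty A/C/G/T sequences are supported")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 15-nucleotide probe (not taken from any SEQ ID NO).
probe = "ATGGCTCTGAAGTCC"
print(len(probe), wallace_tm(probe))
```

Hybridization or annealing is commonly carried out somewhat below
such an estimate; how far below is part of the stringency choice left
to those skilled in the art.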

The mouse "Tespec PRO-1" and "Tespec PRO-2" genes of the present
invention are specifically expressed in the testis. It is also
believed that the genes are specifically expressed in mouse germ
cells at 18 days of age or older. Accordingly, these DNAs can also be
used as markers (diagnostics) for germ cells. In addition,
since the genes of the present invention are thought to be involved
in sperm differentiation and maturation, and/or sperm functions
including the establishment of fertilization, these DNAs can be
used for the examination of infertility.

Further, "nucleotides specifically hybridizing to DNA
comprising any one of the nucleotide sequences shown in SEQ ID NOs:
1, 3, 5, 7 and 9 (or complementary DNA thereof), which nucleotides
have a chain length of at least 15 nucleotides" also include, for
example, antisense oligonucleotides and ribozymes. An antisense
oligonucleotide acts on a cell that produces a protein of the present
invention to bind to DNA or mRNA encoding the protein, thereby
inhibiting the transcription or translation, or enhancing
degradation of the mRNA. Antisense oligonucleotides thus inhibit
the expression of the proteins of the present invention, resulting
in suppression of the functions of the proteins of the present
invention. Such antisense oligonucleotides include, for example,
an antisense oligonucleotide capable of hybridizing to a definite
region of the nucleotide sequences shown in SEQ ID NOs: 1, 3, 5, 7
and 9. Such antisense oligonucleotides are preferably antisense
oligonucleotides complementary to at least 15 consecutive
nucleotides contained in any of the nucleotide sequences shown in
SEQ ID NOs: 1, 3, 5, 7 and 9. More preferably, the above-mentioned
antisense oligonucleotides have at least 15 continuous nucleotides
containing the translation start codon.
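
The selection of a 15-nucleotide antisense oligonucleotide that covers
the translation start codon can be sketched as follows. This is a
minimal illustration only: the sense sequence is hypothetical (it is
not any of SEQ ID NOs: 1, 3, 5, 7 and 9), the function name and the
choice of six upstream nucleotides are assumptions, and the first ATG
in the toy sequence is simply assumed to be the start codon.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_oligo(sense: str, atg_index: int,
                    length: int = 15, upstream: int = 6) -> str:
    """Antisense oligo complementary to `length` consecutive nucleotides
    of the sense strand, chosen so that the window contains the ATG at
    `atg_index`. `upstream` is a hypothetical design parameter."""
    sense = sense.upper()
    if sense[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    start = max(0, atg_index - upstream)
    target = sense[start:start + length]
    if len(target) < length or atg_index + 3 > start + length:
        raise ValueError("window does not cover the start codon")
    return reverse_complement(target)


# Hypothetical sense-strand fragment; the first ATG is assumed to be
# the translation start codon.
sense = "GGCACCATGGCTCTGAAGTCCTGA"
oligo = antisense_oligo(sense, sense.find("ATG"))
print(oligo)  # 15-mer containing CAT, the complement of ATG
```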

Derivatives or modifications of the antisense oligonucleotides
can also be used as antisense oligonucleotides. Such modifications
include, but are not limited to, for example, lower alkyl phosphonate
modifications, such as methyl-phosphonate or ethyl-phosphonate
types; phosphorothioate modifications; or
phosphoroamidate modifications, etc.

The antisense oligonucleotides include not only those in which
all the nucleotides are complementary to the corresponding nucleotides
constituting the given region of the DNA or mRNA, but also
oligonucleotides having one or more mismatches, as long as the DNA
or mRNA and the oligonucleotides can selectively and stably hybridize
with any of the nucleotide sequences of SEQ ID NOs: 1, 3, 5, 7 and
9. Such oligonucleotides are nucleotide sequence regions comprising
at least 15 continuous nucleotides and exhibiting at least 70%
homology, preferably at least 80% homology, more preferably at least
90% homology, most preferably at least 95% homology to the nucleotide
sequence. The algorithm to determine the sequence homology is mentioned
in the references above.
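
To make the homology thresholds concrete, the sketch below scores a
15-nucleotide sequence against an equal-length reference window by
simple position-wise identity and reports which of the stated tiers it
reaches. Ungapped position-wise identity is an assumption used here
for illustration; it stands in for, and is not necessarily the same
as, the homology algorithm referred to above. Both sequences are
hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Position-wise identity (%) between two equal-length sequences."""
    a, b = a.upper(), b.upper()
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def homology_tier(identity: float) -> str:
    """Map an identity value onto the 70/80/90/95% tiers in the text."""
    for threshold in (95, 90, 80, 70):
        if identity >= threshold:
            return f"at least {threshold}%"
    return "below 70%"


# Hypothetical 15-nucleotide window and a variant with one mismatch.
reference = "ATGGCTCTGAAGTCC"
variant = "ATGGCTCTGAAGTCA"
identity = percent_identity(reference, variant)
print(round(identity, 1), homology_tier(identity))
```

With 15 nucleotides, each mismatch costs about 6.7 percentage points,
so under this simple measure a single mismatch still falls in the "at
least 90%" tier while two mismatches fall in the "at least 80%" tier.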

The antisense oligonucleotides of the present invention can be
made into an external preparation, such as a liniment or poultice,
by mixing with a suitable base material which is inactive against
the antisense oligonucleotides. Also, as needed, the antisense
oligonucleotides can be formulated into tablets, powders, granules,
capsules, liposome capsules, injections, solutions, nose-drops, and
freeze-dried agents by adding excipients, isotonic agents,
solubilizers, stabilizers, preservatives, pain-killers, etc. These
can be prepared using the usual methods.

The antisense oligonucleotide derivatives of the present
invention can be applied both in vivo and in vitro. They can be
administered to the patient by directly applying them onto the ailing
site, or by injecting them into a blood vessel or the like so that they
reach the ailing site. An antisense-mounting material can also be used
to increase durability and membrane permeability. Such materials
include, but are not limited to, for example, liposome, poly-L-lysine,
lipid, cholesterol, lipofectin, and derivatives of these.

The dosage of the antisense oligonucleotide derivative of the
present invention can be adjusted suitably according to the patient's
condition and used in desired amounts. For example, a dose ranging
from 0.1 to 100 mg/kg, preferably 0.1 to 50 mg/kg, can be
administered.

Brief Description of the Drawings

Figure 1 shows the mouse "Tespec PRO-1" cDNA sequence and the
amino acid sequence thereof. The active sites of trypsin-family
serine protease are indicated by underlines. The poly A signal is
marked with a wavy line.

Figure 2 shows the mouse "Tespec PRO-2" cDNA sequence and the amino
acid sequence thereof. The active sites of trypsin-family serine
protease are indicated by underlines. The poly A signal is marked
with a wavy line.

Figure 3 shows an alignment of amino acid sequences of mouse
"Tespec PRO-1", "Tespec PRO-2" and known proteases. Amino acids
conserved among all the proteins are marked with "*" and amino acids
with similar characteristics are marked with " " ". The active sites
of trypsin-family serine protease are boxed.

Figure 4 shows a result of amplification of the cDNA for mouse
"Tespec PRO-1" and "Tespec PRO-2" by RT-PCR using mouse testis RNA.
Positions of primers used are indicated in the top panel and the
electrophoretic pattern of the products amplified by RT-PCR is
indicated in the bottom panel.

Figure 5 shows a schematic illustration indicating the
structures of mouse "Tespec PRO-1" and "Tespec PRO-2" as well as
splicing isoforms thereof. The numbers indicated below the boxes
are the numbers of the nucleotides.

Figure 6 shows tissue-specific expression of mouse "Tespec
PRO-1" and "Tespec PRO-2" by RT-PCR. Positions of the primers used
are indicated in the top panel and the electrophoretic pattern of
the products amplified by RT-PCR is indicated in the bottom panel.
1; liver, 2; brain, 3; thymus, 4; heart, 5; lung, 6; spleen, 7; testis,
8; ovary, 9; kidney, 10; fetus of day 10-11, 11; distilled water
(control).

Figure 7 shows tissue-specific expression of mouse "Tespec
PRO-1" and "Tespec PRO-2" investigated by Northern blotting.
Positions of the primers used are indicated in the top panel and the
result of the Northern blotting is indicated in the bottom panel.
1; 7-day-old embryo, 2; 11-day-old embryo, 3; 15-day-old embryo, 4;
17-day-old embryo, 5; heart, 6; brain, 7; spleen, 8; lung, 9; liver,
10; skeletal muscle, 11; kidney, 12; testis.

Figure 8 shows the time of expression of mouse "Tespec PRO-1"
and "Tespec PRO-2" in the testis by RT-PCR analysis. 1; W/Wv testis
No. 1, 2; W/Wv testis No. 2, 3; W/Wv testis No. 3, 4; testis of 4 days
after birth, 5; testis of 8 days after birth, 6; testis of 12 days
after birth, 7; testis of 18 days after birth, 8; testis of 42 days
after birth, 9; adult testis, 10; adult liver, 11; distilled water
(control).

Figure 9 shows the human "Tespec PRO-2" cDNA sequence and the
amino acid sequence thereof. The active sites of trypsin-family
serine protease are indicated by underlines. The poly A signal is
marked with a wavy line.

Figure 10 shows a comparison of nucleotide sequence between
mouse and human "Tespec PRO-2". The nucleotides conserved between
the two are boxed.

Figure 11 shows a comparison of amino acid sequence between
mouse and human "Tespec PRO-2". Amino acid residues shared between
the two are indicated by "*" and amino acid residues with similar
characteristics are indicated by "•". The active sites of
trypsin-family serine protease are boxed.

Figure 12 shows a result of PCR for chromosomal mapping of human
"Tespec PRO-2".

Figure 13 shows the nucleotide and amino acid sequences of human
"Tespec PRO-3" cDNA. The active sites of trypsin-family serine
protease are indicated by underlines. The poly A signal is marked
with a wavy line.

Figure 14 shows a comparison of nucleotide sequence homology
in regard to "Tespec PRO-1" and "Tespec PRO-3". Homologies of the
nucleotide sequences are compared using the full length of mouse "Tespec
PRO-1", an about 400-bp region of an EST from mouse "Tespec PRO-3", and
an about 200-bp region of human "Tespec PRO-3" obtained by RT-PCR
under a low stringency condition as described in Example 9.

Figure 15 shows the mouse "Tespec PRO-3" cDNA sequence and the
amino acid sequence thereof. The active sites of trypsin-family
serine protease are indicated by underlines. The poly A signal is
marked with a wavy line.

Figure 16 shows a comparison of nucleotide sequence between
mouse "Tespec PRO-3" (m. Tespec PRO-3) and human "Tespec PRO-3" (h.
Tespec PRO-3). Nucleotides conserved between the two are boxed.

Figure 17 shows a comparison of amino acid sequence between
mouse "Tespec PRO-3" (m. Tespec PRO-3) and human "Tespec PRO-3" (h.
Tespec PRO-3). Amino acid residues conserved between the two are
boxed.

Best Mode for Carrying out the Invention 

The present invention is illustrated more specifically below
with reference to Examples, but is not to be construed as being
limited to the examples described below.

Example 1. Isolation of "Tespec PRO-1" gene fragment 

A mixture of plasmids derived from 5 x 10^4 clones was isolated
and purified from a plasmid library of mouse heart cDNA (GIBCO, 5
x 10^9 cfu/ml). By using the plasmid mixture as a template, PCR
amplification was performed according to the following procedure,
using the primer "76A5sc2-B" specific to the gene that was named
"76A5sc2" by the present inventors and the vector primer "SPORT RV".

Superscript Mouse heart cDNA library and Superscript Mouse
testis cDNA library (GIBCO, 5 x 10^9 cfu/ml) were diluted 1:100. 1
µl aliquots of the diluted solutions were added to each of 16 tubes
containing 3 ml of LB-Amp medium, and the mixtures were incubated
at 30°C. Then the mixtures of plasmids were prepared with the QIAspin
mini-prep kit (QIAGEN) (each plasmid preparation contains a mixture
of plasmids derived from 5 x 10^4 independent clones). Using the
plasmids from the mouse heart cDNA library as templates, PCR was
carried out with Ampli Taq Gold (Perkin Elmer) as polymerase and the
primer pair of 76A5sc2-B (SEQ ID NO: 11/ 5'-GAT CMA CAG GTG CCA GTC
ATC A-3') and SPORT SP6 (SEQ ID NO: 12/ 5'-ATT TAG GTG ACA CTA TAG
AA-3'). The thermal cycling profile was: a pre-heat at 95°C for 12
minutes, 40 cycles of denaturation at 96°C for 20 seconds, annealing
at 55°C for 20 seconds and extension at 72°C for 2 minutes, and
subsequent final extension at 72°C for 3 minutes.
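
The figure of 5 x 10^4 independent clones per plasmid preparation
follows from the library titer, the 1:100 dilution, and a 1 µl
aliquot. The short calculation below reproduces that arithmetic; it
assumes that each colony-forming unit gives rise to one independent
clone, and the function name is illustrative only.

```python
import math


def clones_per_aliquot(stock_cfu_per_ml: float,
                       dilution_factor: float,
                       aliquot_ul: float) -> float:
    """Number of colony-forming units carried into one culture tube."""
    diluted_cfu_per_ml = stock_cfu_per_ml / dilution_factor
    return diluted_cfu_per_ml * aliquot_ul / 1000.0  # 1000 µl per ml


# Values stated in the text: 5 x 10^9 cfu/ml, 1:100 dilution, 1 µl.
clones = clones_per_aliquot(5e9, 100, 1.0)
print(f"{clones:.0e} clones per tube")
assert math.isclose(clones, 5e4)
```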

The PCR reactions were subjected to electrophoresis on a 1.5%
agarose gel. PCR products of about 0.7 Kb were cut out from the gel
and then recovered with the QIAquick Gel Extraction Kit (QIAGEN). The PCR
products were cloned into pGEM T easy vectors (PROMEGA) by TA cloning
using T4 DNA ligase (PROMEGA).

Eight colonies were selected from the colonies that emerged, and the
inserted fragments were amplified by colony PCR as follows.

The bacteria from each colony, which contain the recombinant
gene, were directly suspended in 20 µl of PCR reaction solution
containing a pair of the primers, SPORT FW (SEQ ID NO: 13/ 5'-TGT
AAA ACG ACG GCC AGT-3') and SPORT RV (SEQ ID NO: 14/ 5'-CAG GAA ACA
GCT ATG ACC-3'), and KOD dash polymerase. PCR was performed by
employing a thermal cycling profile of pre-heat at 94°C for one minute,
subsequently 32 cycles of denaturation at 96°C for 15 seconds,
annealing at 55°C for 5 seconds, and extension at 72°C for 25 seconds.

The amplification of the PCR products of interest was verified 
by agarose gel electrophoresis. If desired, the PCR products were 
purified by gel filtration with Microspin S-300 or S-400 (Pharmacia).

The PCR products from the above colony PCR or RT-PCR were used
as templates for sequencing. After the PCR reaction, the products
generated were examined by agarose gel electrophoresis. If the
products were contaminated, the PCR product of interest was cut out
from the agarose gel to remove the contaminants. Otherwise, the
products were purified by the above-mentioned gel filtration.
Sequencing was performed by cycle sequencing using the Dye Terminator
Cycle Sequencing FS Ready Reaction Kit, the dRhodamine Terminator Cycle
Sequencing FS Ready Reaction Kit, or the BigDye Terminator Cycle
Sequencing FS Ready Reaction Kit (Perkin-Elmer). Primers used were
SPORT FW and SPORT RV. Unreacted primers, nucleotide monomers, and
the like were removed by using a 96-well precipitation HL kit (AGTC).
The nucleotide sequences were determined in the ABI 377 or ABI 377XL
DNA Sequencer (Perkin-Elmer).

The result showed that seven plasmids contained the nucleotide 
sequence of 76A5sc2 and a single plasmid contained a distinct 
nucleotide sequence (the size of insert was about 0.5 Kb). This 
nucleotide sequence was then analyzed by searching the GCG database. 
Since this nucleotide sequence had an ORF, it was translated into
an amino acid sequence. The amino acid sequence was also analyzed 
by searching the GCG database. The results showed that this gene 
fragment contained regions homologous to a number of known 
trypsin-family serine proteases at the nucleotide and amino acid 
levels. However, no known genes showed significant homology to this 
gene fragment over the entire regions, suggesting that this gene 
fragment has a novel origin. Further, the amino acid sequence was 
revealed to have a "Trypsin-His (PROSITE PS00134)" motif, one of the
trypsin-family serine protease motifs. This also suggests that the 
gene fragment is derived from a novel protease gene. 
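
The step of translating a cloned fragment that contains an ORF can be
sketched as follows. This is a minimal illustration using the
standard genetic code and the three forward reading frames only; it
is not the GCG software used in this Example, and the input sequence
is a hypothetical toy fragment. Because a partial cDNA fragment need
not begin with ATG, the sketch reports the longest stop-free stretch
rather than requiring a start codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str, frame: int = 0) -> str:
    """Translate a DNA sequence in the given forward frame ('*' = stop,
    'X' = codon containing an unrecognized base)."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(frame, len(seq) - 2, 3))


def longest_orf(seq: str) -> str:
    """Longest stop-free peptide over the three forward frames."""
    best = ""
    for frame in range(3):
        for segment in translate(seq, frame).split("*"):
            if len(segment) > len(best):
                best = segment
    return best


# Hypothetical toy fragment, not a sequence from this specification.
fragment = "GCTGCCCACTGTGGATAA"
print(translate(fragment), longest_orf(fragment))
```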

Example 2. Cloning of full-length cDNA of the "Tespec PRO-1" gene

By using the plasmid obtained from the Superscript Mouse heart
cDNA library in Example 1 as a template, plasmid library RACE was
carried out employing Ampli Taq Gold as polymerase. The primer sets
used in this experiment were a pair of No9-C (SEQ ID NO: 15/ 5'-ATG
CTT CTG CTA TCG TGG AAG G-3'), which was newly designed based on the
gene fragment isolated in Example 1, and a vector primer, SPORT FW
or SPORT T7 (SEQ ID NO: 16/ 5'-TAA TAC GAC TCA CTA TAG GG-3'), and
a pair of the primer No9-B (SEQ ID NO: 17/ 5'-CTT TGT GCT GAG GTC
TTC AGT G-3'), which was newly designed based on the gene fragment,
and a vector primer, SPORT RV. The thermal cycling profile of the
PCR was: a pre-heat at 95°C for 12 minutes, 42 cycles of denaturation
at 96°C for 20 seconds, annealing at 55°C for 20 seconds and extension
at 72°C for 5 minutes, and subsequent final extension at 72°C for
3 minutes.
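
The thermal cycling profile above can be written down as a small data
structure, which also makes it easy to estimate the programmed hold
time of a run. The class and field names below are illustrative
assumptions, and the total ignores the ramping time between
temperatures, which depends on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


@dataclass(frozen=True)
class Program:
    preheat: Step
    cycle: tuple  # steps repeated in every cycle
    cycles: int
    final_extension: Step

    def hold_seconds(self) -> int:
        """Sum of programmed hold times, excluding ramp time."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return (self.preheat.seconds
                + self.cycles * per_cycle
                + self.final_extension.seconds)


# Profile stated in the text for the plasmid library RACE.
race_pcr = Program(
    preheat=Step("pre-heat", 95, 12 * 60),
    cycle=(Step("denaturation", 96, 20),
           Step("annealing", 55, 20),
           Step("extension", 72, 5 * 60)),
    cycles=42,
    final_extension=Step("final extension", 72, 3 * 60),
)
print(race_pcr.hold_seconds() / 60, "minutes of programmed holds")
```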

The PCR products were identified by agarose gel electrophoresis.
Further, for these PCR products, the nucleotide sequences were
determined directly or after cloning into the pGEM T easy vector.

Since two PCR bands were obtained by 3' RACE, the nucleotide
sequences thereof were determined. The sequencing revealed that one
of the two had the nucleotide sequence of the other except that a poly
A stretch was attached to an internal site in the nucleotide sequence.

Likewise, 5' RACE also gave two PCR bands with different sizes.
DNAs from the respective bands were subcloned, and their nucleotide
sequences were determined. The result revealed that the two were
identical to each other in nucleotide sequence at the 3' end,
indicating that the two were different isoforms produced by
alternative splicing.

The nucleotide sequences from the shorter band generated by 5'
RACE and the longer band generated by 3' RACE were ligated to each
other to give a nucleotide sequence encoding the entire protease,
which was designated "Tespec PRO-1" (Testis specific expressed
serine proteinase-1).

The resulting "Tespec PRO-1" cDNA contains 1033 nucleotides and
is predicted to code for 321 amino acids (Figure 1). The nucleotide
sequence is shown in SEQ ID NO: 1 and the amino acid sequence is
shown in SEQ ID NO: 2. The amino acid sequence contains a
hydrophobic region at its N-terminus, which is predicted to be a
signal peptide. The amino acid sequence also has a region rich in
hydrophobic amino acids at its C-terminus.

Based on the analytical search of the GCG database, the amino acid
sequence was found to contain two types of trypsin-family serine
protease motifs, "Trypsin-His (PROSITE PS00134)" and "Trypsin-Ser
(PROSITE PS00135)". PROSITE indicates "if a protein includes both
the serine and histidine active site signatures, the probability of
it being a trypsin family serine protease is 100%" (Brenner, S., 1988,
Nature, 334: 528-530; Rawlings, N. D. and Barrett, A. J. (1994) Meth.
Enzymol., 244: 19-61). "Tespec PRO-1" therefore can be regarded as
a trypsin-family serine protease. The nucleotide sequence of this
gene and its deduced amino acid sequence were analyzed by searching
the GCG database. The results showed that the two motifs mentioned
above and the flanking regions thereof exhibit high homologies to known
trypsin-family serine proteases, such as acrosin, prostasin and
trypsin. It was also revealed that the positions of aspartic acid
residues required for the protease activity and the cysteine residues
anticipated to be responsible for intramolecular disulfide bonding
are well conserved relative to other proteases (Figure 3). For the
other regions, however, no known genes or proteins were found to
exhibit significant homology to this sequence at the nucleotide and
amino acid levels, revealing that this protein is a novel
trypsin-family serine protease.

Example 3. Cloning of full-length cDNA of the "Tespec PRO-2" gene 

For the band with the larger molecular weight (the band with a
nucleotide sequence different from that of "Tespec PRO-1" at the 5'
end), which was obtained during the cloning of "Tespec PRO-1" by 5'
RACE in Example 2, 3' and 5' RACE were carried out using newly
synthesized primers designed based on the nucleotide sequence of
"Tespec PRO-1" (No9-G or No9-J) and using, as templates, the
plasmid mixture obtained from the Superscript Mouse testis cDNA
library in Example 1.

Specifically, PCR was conducted by using primer pairs of No9-G
(SEQ ID NO: 18/ 5'-CAG TCA ATG TCA CTG TGG TCA T-3') and SPORT FW,
and No9-J (SEQ ID NO: 19/ 5'-ACT TGC CGT TGG TGC CCA CTT C-3') and
SPORT RV. In this PCR, Ampli Taq Gold was used as polymerase and
its thermal cycling profile was as follows: a pre-heat at 95°C for
12 minutes, 42 cycles of denaturation at 96°C for 20 seconds,
annealing at 55°C for 20 seconds and extension at 72°C for 5 minutes,
and subsequent final extension at 72°C for 3 minutes.

The nucleotide sequences of the PCR products were determined
directly or after cloning into the pGEM T easy vector.

Two 3' RACE products were obtained, both of which
were sequenced. This analysis showed that the two nucleotide sequences
have an identical region at their 5' ends but distinct
regions at their 3' ends. One of the sequences was identical to the
aforementioned nucleotide sequence having the sequence of "Tespec
PRO-1" in which a poly A stretch is attached to an internal site of
the sequence. The other sequence contained a nucleotide sequence
different from that of "Tespec PRO-1" at its 3' end.

Multiple bands were given by 5' RACE. Those bands were subcloned,
and their nucleotide sequences were determined. The result showed
that all these bands share an identical 3' terminal sequence. Thus
they were shown to be splicing isoforms. Since one of the 5' RACE
products has a long ORF, this 5' RACE product and the above-mentioned
3' RACE product whose nucleotide sequence is different from that of
"Tespec PRO-1" at the 3' end were assembled together, thereby giving
a nucleotide sequence presumed to encode a protease. This sequence
was named "Tespec PRO-2". The nucleotide sequence is shown in SEQ
ID NO: 3, and the deduced amino acid sequence is shown in SEQ
ID NO: 4.

"Tespec PRO-2" cDNA thus obtained consists of 1034 nucleotides
(Figure 2) and its 5' non-coding region consists of 68 nucleotides.
By contrast, the 3' non-coding region of this cDNA is much shorter,
consisting of only nine nucleotides. A putative poly A signal found
in this cDNA is GATAAA, and it is predicted to be a weaker signal
compared to the signal generally recognized in mRNAs (AAUAAA). Based
on the sequence of this cDNA, "Tespec PRO-2" is predicted to encode
a protein of 319 amino acids, which contains a putative signal peptide
region at its N-terminus. But, unlike "Tespec PRO-1", the protein does not
contain a region rich in hydrophobic amino acids at its C-terminus.
While the amino acid sequence contains a trypsin-family serine
protease motif, "Trypsin-His", the "Trypsin-Ser" motif of this
protein (GKCQGDSGAPMV) contains 2 amino acid residues that
deviate from the consensus sequence of the motif, which consists of
12 amino acid residues
([DNSTAGC]-[GSTAPIMVQH]-X-X-G-[DE]-S-G-[GS]-[SAPHV]-[LIVMFYWH]-[LIVMFYSTANQH]).
However, some known trypsin-family
serine proteases have sequences that differ from the consensus
sequence at several amino acid residues. Thus, the "Tespec PRO-2" obtained
is predicted to function as a protease.
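
The comparison of the "Trypsin-Ser" region of "Tespec PRO-2" with the
12-residue consensus can be checked position by position, as in the
sketch below. The parser is a simplified assumption: it handles only
the elements that occur in this particular pattern (single residues,
bracketed sets, and the wildcard X), not the full PROSITE pattern
syntax.

```python
# Consensus of the "Trypsin-Ser" motif as given in the text.
PATTERN = ("[DNSTAGC]-[GSTAPIMVQH]-X-X-G-[DE]-S-G-[GS]-[SAPHV]-"
           "[LIVMFYWH]-[LIVMFYSTANQH]")


def parse_pattern(pattern: str) -> list:
    """Turn the pattern into a list of allowed-residue sets
    (None means any residue is accepted at that position)."""
    allowed = []
    for element in pattern.split("-"):
        if element.upper() == "X":
            allowed.append(None)
        else:
            allowed.append(set(element.strip("[]")))
    return allowed


def deviations(peptide: str, pattern: str = PATTERN) -> list:
    """Return (1-based position, residue) pairs that violate the pattern."""
    allowed = parse_pattern(pattern)
    if len(peptide) != len(allowed):
        raise ValueError("peptide and pattern must have the same length")
    return [(pos, aa)
            for pos, (aa, ok) in enumerate(zip(peptide.upper(), allowed),
                                           start=1)
            if ok is not None and aa not in ok]


print(deviations("GKCQGDSGAPMV"))  # -> [(2, 'K'), (9, 'A')]
```

The two reported positions correspond to the two deviating residues
mentioned in the text; the catalytic serine at position 7 and the
flanking G-D-S-G core match the consensus.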

The nucleotide sequence of "Tespec PRO-2" and its deduced amino
acid sequence were analyzed by searching the GCG database. The
results showed that, like "Tespec PRO-1", the two motifs of "Tespec
PRO-2" mentioned above and the flanking regions thereof exhibit high
homologies to known trypsin-family serine proteases. It was also
revealed that the positions of aspartic acid residues required for
the protease activity and the cysteine residues anticipated to be
responsible for intramolecular disulfide bonding are highly
conserved relative to other proteases (Figure 3). For the other
regions, however, no known genes or proteins were found to exhibit
significant homology at the nucleotide and amino acid levels,
revealing that this protein is a novel trypsin-family serine
protease.


Example 4. Splicing isoforms of "Tespec PRO-1" and "Tespec PRO-2" 
Homologies between "Tespec PRO-1" and "Tespec PRO-2" were 52.2%
and 33.1% at the nucleotide and amino acid levels, respectively.
These values are similar in extent to those among other known
trypsin-family serine proteases.

The splicing isoform of "Tespec PRO-2" obtained by 5' RACE in
Example 3 does not appear to encode a protease, since it contains
multiple termination codons in the nucleotide sequence at the
splicing junction and in the region that is missing in "Tespec PRO-2",
which prevent the ORF from extending. The splicing isoform was analyzed
in more detail by RT-PCR as follows.

Based on the nucleotide sequence obtained by cDNA cloning,
primers were synthesized which include No9-P (SEQ ID NO: 20/ 5'-GCA
CTG GAA TGA CAA CAT GAT GC-3'), No9-Q (SEQ ID NO: 21/ 5'-ATT GGC GTG
GCA AGT AGG AGC A-3'), No9-N (SEQ ID NO: 22/ 5'-CGA GTC TCC CAG TTA
GCA CAG A-3'), No9-M (SEQ ID NO: 23/ 5'-CGG TGA CTT GGT CAT GTC TGT
G-3'), No9-K (SEQ ID NO: 24/ 5'-GGA TCC ATG AAA CGA TGG AAG GAC AGA
AG-3'), No9-G, No9-J, and No9-O (SEQ ID NO: 25/ 5'-CGC AGA GTT CTG
CTC ATA CAT A-3'). RT-PCR was performed by using these primers, cDNAs
prepared from mouse tissue as templates, Ampli Taq Gold as polymerase
and the thermal cycling profile of: pre-heating at 95°C for 12 minutes,
40 cycles of denaturation at 96°C for 20 seconds, annealing at 60°C
for 20 seconds and extension at 72°C for 1 minute, and subsequent
final extension at 72°C for 3 minutes. PCR reactions were subjected
to electrophoresis on a 1.5% Seakem GTG agarose (TaKaRa) gel.

The results of RT-PCR analysis (Figures 4 and 5) showed that
isoforms having the boxes (2-I)-(2-III)-(2-VI) at the 5' end
appear to be dominant in the population of the splicing isoforms of
"Tespec PRO-2". The population appears to be larger than that of
"Tespec PRO-2". The RT-PCR analysis has verified cDNA isoforms with
Box 2-I in which the Box is connected via Box 2-VI to Box 2-VII or
Box 1-II (the latter is suspected to be a chimeric cDNA molecule with
"Tespec PRO-1"). In contrast, the analysis also revealed that there
is only a single type of cDNA isoform with Box 2-IIb, a chimeric cDNA
with "Tespec PRO-1" in which the Box is connected via Box 2-VI to
Box 1-II (Figures 4 and 5). Such chimeras may be formed because
"Tespec PRO-2" and "Tespec PRO-1" are located in close proximity
on the chromosome, as well as due to the weak signal intensity of the
poly A signal in "Tespec PRO-2". It remains to be clarified why such
seemingly meaningless splicing isoforms (encoding only short proteins)
exist. However, there is a possibility that the
expression of "Tespec PRO-2" is regulated by splicing as well as
transcriptionally.

Example 5. Tissue distribution of the "Tespec PRO-1" and "Tespec 
PRO-2" genes 

Tissue distribution of "Tespec PRO-1" and "Tespec PRO-2" was
investigated by RT-PCR. Total RNAs (Ambion) isolated from 10 types
of adult mouse tissue (liver, brain, thymus, heart, lung, spleen,
testis, uterus, kidney, and fetus of day 10-11) were used to
synthesize cDNA by reverse transcription using Superscript II
(GIBCO) as a reverse transcriptase and using the (dT)30VN primer. The
resulting cDNAs were used as templates for RT-PCR. QUICK-Clone cDNA
from mouse 7-day embryo as well as 17-day embryo (CLONTECH) was also
used as a template for RT-PCR.

"Tespec PRO-1"-specific primers used were No9-A (SEQ ID NO: 26/
5'-GGC ATG TAG CTC ACT GGC ATG-3') and No9-B. "Tespec PRO-2"-specific
primers used were 29(-) (SEQ ID NO: 27/ 5'-GGA CCA GCA AGA ATC AGT
TCT G-3') and 17(+)95(+) (SEQ ID NO: 28/ 5'-CTG CTA CCA GTT CTA ATT
TGC C-3'). G3PDH control primers used were G3PDH 5' (SEQ ID NO: 29/
5'-GAG ATT GTT GCC ATC AAC GAC C-3') and G3PDH 3' (SEQ ID NO: 30/
5'-GTT GAA GTC GCA GGA GAC AAC C-3'). Polymerase used was Ampli Taq
Gold and the thermal cycling profile of PCR was: pre-heat at 95°C
for 12 minutes, 42 cycles of denaturation at 96°C for 20 seconds,
annealing at 60°C for 20 seconds and extension at 72°C for 30 seconds
(28 cycles for G3PDH), and subsequent final extension at 72°C for
3 minutes. The PCR reactions were subjected to electrophoresis on
a 1.5% Seakem GTG agarose (TaKaRa) gel.
The result showed that both "Tespec PRO-1" and "Tespec PRO-2"
were expressed in the testis at high levels (Figure 6). Interestingly,
it was also shown that these genes, despite being cloned from the
plasmid library of mouse heart cDNA, were hardly expressed in the
heart. In the tissues other than the testis, the bands of interest
were observed, though they were very faint.

In addition, tissue distribution was analyzed by mouse MTN blot
(CLONTECH), using, as probes, a part of the coding region of "Tespec
PRO-1" (the region containing the entire sequence of Box 1-II; the
nucleotide positions 110 to 401) and a region in the vicinity of exon
2-VI of "Tespec PRO-2" (nucleotide positions 340 to 723) (this probe
may recognize "Tespec PRO-2" and all the splicing isoforms thereof,
since it covers the region that is common to many of the splicing
isoforms of "Tespec PRO-2"; therefore it is not a "Tespec
PRO-2"-specific probe).

The RT-PCR products amplified by using cDNAs from adult mouse
testis as templates and No9-A and No9-B primers were labeled with
[α-32P]dCTP by using the Megaprime DNA labeling system (Amersham),
and unreacted [α-32P]dCTP was removed to give the "Tespec PRO-1" probe.
Likewise, the "Tespec PRO-2" probe was prepared by PCR using No9-G
and No9-J primers and subsequently by labeling with [α-32P]dCTP. The
hybridization was carried out at 68°C by using Mouse Multiple Tissue
Northern (MTN) blot and Mouse Embryo Multiple Tissue Northern (MTN)
blot (CLONTECH) in ExpressHyb Hybridization Solution (CLONTECH),
according to the manufacturer's instructions.

A band about 1.2 Kb in length was observed only in the testis
by using the "Tespec PRO-1" probe (Figure 7). This band was not
detected in tissues other than the testis, nor in the fetus.
Like the "Tespec PRO-1" probe, the "Tespec PRO-2" probe also detected
an about 1.2-Kb band only in the testis (Figure 7). The band was
not detected in tissues other than the testis, nor in the fetus.

The results described above demonstrate that both "Tespec 
PRO-1" and "Tespec PRO-2" are specifically expressed in the testis. 

Example 6. Expression times of the "Tespec PRO-1" and "Tespec PRO-2"
genes in the testis

In mice, the primordial germ cells emerge in the fetus 7 days
after fertilization, migrate to the genital ridge (11 days
after fertilization) and differentiate into precursor cells of
spermatogonia (13 days after fertilization). The precursor cells
of spermatogonia enter an arrested state from then on. They
become spermatogonia, germ-line stem cells, after birth and then
start their self-proliferation and differentiation into sperm. It
takes about 34 days for spermatogonia to differentiate via
spermatocytes and spermatids into mature sperm (in actuality, since
spermatogonia per se have their own differentiation stage, if this
stage is included, the period required for maturation is about 42
days in total). Thus, by collecting testes of postnatal mice at
different days after birth and examining the expression of "Tespec
PRO-1" and "Tespec PRO-2", it can be determined at what stage of
sperm differentiation the genes are expressed, or whether the genes
are expressed in nurse cells (e.g. Sertoli's cells and Leydig's cells)
in the testis.

Meanwhile, there exists a mutant mouse W (White spotting) that
has a defect in chromosome 5 (Besmer, P. et al. (1993) Dev. Suppl.,
125-137). This mutant mouse has a defect in c-kit, which is a receptor
tyrosine kinase expressed in the spermatogonia and spermatocytes.
The mutant mouse has a deficiency in germ cells (complete deficiency)
or a differentiation insufficiency (partial deficiency) at the
stages after spermatogonium, though it has normal nurse cells such
as Sertoli's cells and Leydig's cells in the testis. Thus, the
expression of "Tespec PRO-1" and "Tespec PRO-2" was examined in the
testis of the mutant mice W/Wv.

RT-PCR was performed by using, as templates, cDNAs prepared
from total RNAs isolated from mouse testes 4 days, 8 days, 12 days,
18 days, and 42 days after birth, and from testes of three W/Wv mice
56 days after birth. In this RT-PCR experiment, cDNAs from adult
mouse testis and liver were also used. The primers used were the
"Tespec PRO-1"-specific primers and "Tespec PRO-2"-specific primers
described above in Example 5. In the same manner as described in
Example 5, 40 cycles (29 cycles for G3PDH) of PCR were conducted.

The results of RT-PCR demonstrated that the expression levels of
"Tespec PRO-1" and "Tespec PRO-2" were elevated in the testis 18 days
after birth and later; neither gene was expressed at all before 12
days after birth, or in the testis of the W/Wv mutant mouse (Figure 8).
No expression of the genes was detected in the liver, the negative
control. These results suggest that both "Tespec PRO-1" and "Tespec
PRO-2" are expressed not in nurse cells such as Sertoli's cells
and Leydig's cells, but in germ cells, and that their expression
levels are elevated in the spermatocytes differentiated from germ
cells or in the spermatids after meiosis.

Example 7. Cloning of full-length cDNA of human "Tespec PRO-2" 

Human "Tespec PRO-2" cDNA was cloned based on the nucleotide
sequence of mouse "Tespec PRO-2". Human testis poly A+ RNA (CLONTECH)
was converted into cDNA by using the reverse transcriptase
Superscript II (GIBCO) and a (dT)30VN primer. PCR was carried out by
using the cDNA as a template together with the No9-G and No9-Q primers
derived from mouse "Tespec PRO-2". The polymerase used was AmpliTaq
Gold, and the thermal cycling profile of the low-stringency PCR was:
pre-heating at 95°C for 12 minutes; 42 cycles of denaturation at 96°C
for 20 seconds, annealing at 55°C for 20 seconds, and extension at
72°C for 30 seconds; and a final extension at 72°C for 3 minutes.

The resulting RT-PCR product was sequenced directly to
determine its nucleotide sequence. The result showed that this PCR
product is a gene fragment of human "Tespec PRO-2", which exhibits
about 80% homology to mouse "Tespec PRO-2" in nucleotide sequence.
Based on this nucleotide sequence, primers for 5' RACE, i.e. h-B (SEQ
ID NO: 31/ 5'-AGA GGT CAC TGT CGA GCT GGG-3') and h-D (SEQ ID NO:
32/ 5'-TGT GAA TAA TGA CCT TCT GCA C-3'), and primers for 3' RACE,
i.e. h-A (SEQ ID NO: 33/ 5'-TTC AGC AAC ATC CAC TCG GAG A-3') and
h-C (SEQ ID NO: 34/ 5'-AAG CAA GTG CAG AAG GTC ATT A-3'), were
generated. Nested 3' and 5' RACE was conducted by using human testis
Marathon ready cDNA (CLONTECH) as a template, according to the
manufacturer's instructions. As a result, a full-length cDNA for
human "Tespec PRO-2" was cloned successfully. The nucleotide sequence
is shown in SEQ ID NO: 5 and the amino acid sequence thereof is shown
in SEQ ID NO: 6.
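
The relative orientation of the gene-specific RACE primers listed above can be checked with a few lines of code. The following is a minimal sketch in Python and is not part of the described procedure; the helper names `reverse_complement` and `overlap_length` are illustrative. It uses only the primer sequences quoted in the text and indicates that the 5' RACE primer h-D and the 3' RACE primer h-C cover the same region of the fragment on opposite strands.

```python
# Primer sequences as quoted in the text (SEQ ID NOs: 32 and 34), spaces removed.
H_D = "TGTGAATAATGACCTTCTGCAC"  # h-D, primer for 5' RACE
H_C = "AAGCAAGTGCAGAAGGTCATTA"  # h-C, primer for 3' RACE

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def overlap_length(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for n in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:n]):
            return n
    return 0


# h-D read on the same strand as h-C.
h_d_sense = reverse_complement(H_D)
shared = overlap_length(H_C, h_d_sense)

print(h_d_sense)  # GTGCAGAAGGTCATTATTCACA
print(shared)     # 16
```

The 16-nucleotide overlap is consistent with the two primers having been designed around the same stretch of the sequenced fragment, one for extension toward the 5' end and the other toward the 3' end.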

The human "Tespec PRO-2" cDNA consists of 1035 nucleotides and
is predicted to encode 265 amino acids (Figure 9). Homology between
human and mouse "Tespec PRO-2" is 74.2% at the nucleotide level and
69.8% at the amino acid level. The amino acid sequence of human
"Tespec PRO-2" is shorter than that of mouse "Tespec PRO-2" by 54
residues at the C-terminus, and consequently the human nucleotide
sequence has a longer 3' non-coding region than that of the mouse
gene (Figures 10 and 11). In addition, there is a region
predicted to be a signal peptide at the N-terminus, and the C-terminal
region is also rich in hydrophobic amino acids. The deduced amino
acid sequence of human "Tespec PRO-2" contains a trypsin-family
serine protease motif, "Trypsin-His". The "Trypsin-Ser" motif
of this protein (GIFKGDSGAPLV) contains one amino acid residue that
deviates from the consensus sequence of this motif, which consists
of 12 amino acid residues
([DNSTAGC]-[GSTAPIMVQH]-X-X-G-[DE]-S-G-[GS]-[SAPHV]-
[LIVMFYWH]-[LIVMFYSTANQH]) (mouse "Tespec PRO-2" contains two amino
acid residues that deviate from this 12-residue consensus sequence).
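
The comparison of the "Trypsin-Ser" motif with its consensus can be reproduced position by position. The sketch below is an illustration in Python under the assumption that the consensus is exactly as quoted above; the names `TRYPSIN_SER` and `deviations` are illustrative, not taken from any published tool. It reports every position at which a 12-residue motif falls outside the allowed residues.

```python
# Allowed residues at each of the 12 positions of the "Trypsin-Ser" consensus
# quoted in the text; None stands for X (any residue).
TRYPSIN_SER = [
    set("DNSTAGC"), set("GSTAPIMVQH"), None, None, set("G"), set("DE"),
    set("S"), set("G"), set("GS"), set("SAPHV"), set("LIVMFYWH"),
    set("LIVMFYSTANQH"),
]


def deviations(motif, consensus):
    """Return (1-based position, residue) pairs that fall outside the consensus."""
    if len(motif) != len(consensus):
        raise ValueError("motif and consensus must have the same length")
    return [
        (i, aa)
        for i, (aa, allowed) in enumerate(zip(motif, consensus), start=1)
        if allowed is not None and aa not in allowed
    ]


human_motif = "GIFKGDSGAPLV"  # human "Tespec PRO-2" motif quoted in the text
print(deviations(human_motif, TRYPSIN_SER))  # [(9, 'A')]
```

For the human motif the only deviation is the alanine at position 9, where the consensus allows G or S, in agreement with the single deviating residue described above.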

A database search showed that no known genes or proteins
exhibit significant homology to human "Tespec PRO-2" at the
nucleotide or amino acid level, indicating that this protein is
a novel trypsin-family serine protease.

Example 8. Chromosomal mapping of human "Tespec PRO-2"

PCR was performed by using a human chromosome panel (CORIELL
CELL REPOSITORIES) as a template, a pair of primers, h-A and h-F (SEQ
ID NO: 35/ 5'-CAT TGG TCG TTA CCC ACT GTG C-3'), and Advantage cDNA
polymerase (CLONTECH) as the polymerase. The thermal cycling profile
of the PCR was: pre-heating at 95°C for 1 minute; 37 cycles of
denaturation at 96°C for 15 seconds, annealing at 60°C for 15 seconds,
and extension at 68°C for 30 seconds; and a final extension at 68°C
for 3 minutes. The PCR products were subjected to electrophoresis on
a 1.5% Seakem GTG agarose (TaKaRa) gel.

As a result of the PCR, human "Tespec PRO-2" was mapped to
chromosome 8 (Figure 12).

Example 9. Cloning of full-length cDNA of the human "Tespec PRO-3"
gene

Human testis poly A+ RNA (CLONTECH) was converted into cDNA by
using the reverse transcriptase Superscript II (GIBCO) and a (dT)30VN
primer. RT-PCR was carried out by using the synthesized cDNA as a
template and the primer pair PRO1-E (SEQ ID NO: 36/ 5'-ATT CTC
AAT GAG TGG TGG GTT CT-3') and PRO1-D (SEQ ID NO: 37/ 5'-CCA GCA CAC
AGC ATA TTC TTG G-3'), which were synthesized on the basis of the
nucleotide sequence of mouse "Tespec PRO-1". The low-stringency PCR
was performed using the polymerase AmpliTaq Gold with the following
thermal cycling profile: pre-heating at 95°C for 12 minutes; 5 cycles
of denaturation at 96°C for 20 seconds, annealing at 50°C for 20
seconds, and extension at 72°C for 45 seconds; a subsequent 35 cycles
of denaturation at 96°C for 20 seconds, annealing at 60°C for 20
seconds, and extension at 72°C for 45 seconds; and a final extension
at 72°C for 3 minutes.
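
The two-stage cycling profile above can be written down as a small data structure, which makes the cycle count and the programmed hold time easy to verify. This is a sketch in Python for illustration only; the `Stage` class and the `PROFILE` name are assumptions introduced here, and ramp times between temperatures, which depend on the instrument, are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One block of a cycling program; each step is (temperature in °C, seconds)."""
    steps: tuple
    cycles: int = 1

    def seconds(self) -> int:
        """Total programmed hold time of this stage."""
        return self.cycles * sum(duration for _temp, duration in self.steps)


# Profile transcribed from the text above.
PROFILE = (
    Stage(steps=((95, 12 * 60),)),                           # pre-heating
    Stage(steps=((96, 20), (50, 20), (72, 45)), cycles=5),   # annealing at 50°C
    Stage(steps=((96, 20), (60, 20), (72, 45)), cycles=35),  # annealing at 60°C
    Stage(steps=((72, 3 * 60),)),                            # final extension
)

total_cycles = sum(s.cycles for s in PROFILE if s.cycles > 1)
hold_seconds = sum(s.seconds() for s in PROFILE)

print(total_cycles)  # 40
print(hold_seconds)  # 4300 (about 72 minutes, excluding ramping)
```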

The RT-PCR product was purified by gel filtration and then its
nucleotide sequence was determined. Sequence analysis
revealed that this product is a gene fragment encoding a
trypsin-family serine protease. Translation of this gene
fragment revealed that it contains a "Trypsin-His" motif. A
database search with the nucleotide sequence of this gene fragment
showed that it overlaps in part with the sequence of a human EST
(AA781356, aj25c04.s1 Soares_testis_NHT Homo sapiens cDNA clone
1391334 3', mRNA sequence). Translation of this EST revealed the
presence of a "Trypsin-Ser" motif in the amino acid sequence. Then,
on the basis of the nucleotide sequence of the gene fragment obtained,
primers were prepared: hPRO3-B (SEQ ID NO: 38/ 5'-GGA AAC AGC TCC
TCG GAA TAT AAG C-3') and hPRO3-D (SEQ ID NO: 39/ 5'-TGG ATG GGC TAG
TTA AGT CGT TGG T-3') for 5' RACE, and hPRO3-A (SEQ ID NO: 40/ 5'-TTC
GAG GGA AGA ACT CGG TAT TC-3') and hPRO3-C (SEQ ID NO: 41/ 5'-TGT
GAA AAC GGA TCT GAT GAA AGC G-3') for 3' RACE. Nested RACE was
conducted by using human testis Marathon ready cDNA (CLONTECH) as
a template, according to the manufacturer's instructions, to clone a
full-length cDNA. The product obtained by the RACE was sequenced
directly or after being subcloned into the pGEM-T Easy vector. The
nucleotide sequence is shown in SEQ ID NO: 9 and the amino acid
sequence is shown in SEQ ID NO: 10.

This novel human gene showed higher homology to mouse testis
ESTs deposited in the database (AA497965, AA497934, AA497919, etc.)
than to mouse "Tespec PRO-1" (Figure 14), even though the gene was
obtained using primers designed on the basis of the nucleotide
sequence of mouse "Tespec PRO-1". Thus, the gene was designated human
"Tespec PRO-3".

The human "Tespec PRO-3" cDNA consists of 1123 nucleotides and
is predicted to encode 352 amino acids (Figure 13). The encoded
protein has a putative signal peptide at its N-terminus and contains
the "Trypsin-His" and "Trypsin-Ser" motifs. In addition, cysteine
residues that are predicted to form an intramolecular disulfide
bond are well conserved relative to other serine proteases.

Example 10. Cloning of full-length cDNA of the mouse "Tespec PRO-3"
gene

Mouse "Tespec PRO-3", the mouse counterpart of the
above-mentioned human "Tespec PRO-3", is considered to contain some
of the nucleotide sequences of the above-mentioned ESTs, which are
derived from mouse testis. Eight mouse EST sequences for this gene
have been deposited in a database. Among them, four ESTs
are derived from the testis, one is derived from the kidney, and the
remaining three are derived from cDNAs of unknown origin. Thus,
primers were designed on the basis of these ESTs to conduct RACE using
mouse testis Marathon ready cDNA as a template, and the full-length
cDNA sequence of mouse "Tespec PRO-3" was cloned.

On the basis of the nucleotide sequences of the mouse ESTs
(AA497965, AA497934, AA497919, AA497949, AA271404, AA238183,
AA240375, and AA105229), primers for 5' RACE, i.e. mPRO3-B (SEQ ID
NO: 42/ 5'-CAC CTA CTG CCA GGA TCT GTG G-3') and mPRO3-D (SEQ ID NO:
43/ 5'-GGC TAT TTT CTC AAT CCA CAG GGT A-3'), and primers for 3' RACE,
i.e. mPRO3-A (SEQ ID NO: 44/ 5'-ATA GAG TGG GAG GAA TGC TTA CAG A-3')
and mPRO3-C (SEQ ID NO: 45/ 5'-GCT ACG ATG CTT GCC AGG GTG-3'), were
generated. Nested RACE was conducted by using the mouse testis
Marathon ready cDNA (CLONTECH) as a template, according to the
manufacturer's instructions. The product obtained by RACE was
sequenced directly or after being subcloned into the pGEM-T Easy
vector. The nucleotide sequence is shown in SEQ ID NO: 7 and the
amino acid sequence is shown in SEQ ID NO: 8.

The mouse "Tespec PRO-3" cDNA consists of 1028 nucleotides and
is predicted to encode 321 amino acids (Figure 15). While the
deduced amino acid sequence contains a "Trypsin-Ser" motif, its
"Trypsin-His" motif (LTVAHC) deviates at one amino acid residue from
the consensus motif consisting of 6 amino acids,
[LIVM]-[ST]-A-[STAG]-H-C. However, as with mouse "Tespec PRO-2", some
known trypsin-family serine proteases have sequences that deviate
from the consensus sequence at several amino acids, and therefore
mouse "Tespec PRO-3" is predicted to function as a protease. In
addition, it has a hydrophobic region predicted to be a signal peptide
at its N-terminus. Cysteine residues predicted to form an
intramolecular disulfide bond are well conserved in the sequence
relative to other serine proteases.
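
The dash-separated consensus notation used above maps almost directly onto a regular expression, since each bracketed group is already a valid character class. The sketch below, in Python, is an illustration under that assumption; the function names are hypothetical, and it handles only single residues, bracketed groups and X, not the full PROSITE syntax. It confirms that LTVAHC does not match the "Trypsin-His" consensus exactly and locates the deviating position.

```python
import re

TRYPSIN_HIS = "[LIVM]-[ST]-A-[STAG]-H-C"  # consensus as written in the text


def pattern_elements(pattern: str) -> list:
    """Turn a dash-separated consensus into one regex fragment per position."""
    elements = pattern.replace(" ", "").split("-")
    return ["." if e.upper() == "X" else e for e in elements]


def mismatch_positions(motif: str, pattern: str) -> list:
    """Return the 1-based positions at which the motif violates the consensus."""
    elements = pattern_elements(pattern)
    if len(motif) != len(elements):
        raise ValueError("motif length does not match the consensus")
    return [
        i
        for i, (fragment, aa) in enumerate(zip(elements, motif), start=1)
        if re.fullmatch(fragment, aa) is None
    ]


strict = re.compile("".join(pattern_elements(TRYPSIN_HIS)))
print(strict.fullmatch("LTVAHC") is None)         # True: no exact match
print(mismatch_positions("LTVAHC", TRYPSIN_HIS))  # [3]
```

The single mismatch is the valine at position 3, where the consensus requires alanine.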
Homology between human and mouse "Tespec PRO-3" is 70.2% at the
nucleotide level and 59.6% at the amino acid level (Figures 16 and
17). Compared with human "Tespec PRO-3", mouse
"Tespec PRO-3" is shorter in nucleotide sequence by about 100
residues at the 5' end, and shorter in amino acid sequence by
about 30 residues at the N-terminus.
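
The text does not state how the homology percentages were calculated. One common definition is percent identity over the columns of a pairwise alignment, and the result depends on whether gapped columns are counted. The following Python sketch illustrates that definition only; the function name and the two toy alignment rows are invented for illustration and are not the "Tespec PRO-3" sequences.

```python
def percent_identity(a: str, b: str, gap: str = "-", count_gaps: bool = False) -> float:
    """Percent identity between two equal-length rows of a pairwise alignment."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    matches = compared = 0
    for x, y in zip(a, b):
        has_gap = x == gap or y == gap
        if has_gap and not count_gaps:
            continue  # leave gapped columns out of the comparison
        compared += 1
        if not has_gap and x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy alignment rows, invented for illustration only.
row_a = "MKT-LLA"
row_b = "MRTALLV"

print(round(percent_identity(row_a, row_b), 1))                   # 66.7
print(round(percent_identity(row_a, row_b, count_gaps=True), 1))  # 57.1
```

Because one sequence here is shorter than the other at one end, the choice of whether terminal gaps are counted can change the reported figure noticeably.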

Industrial Applicability 

Provided by the present invention are novel trypsin-family
serine proteases and the genes encoding them. The proteins of the
present invention are suggested to be involved in sperm
differentiation and maturation or in sperm function (fertilization).
Thus, the proteases of the present invention and the genes thereof
are expected to be useful for developing new therapeutic or diagnostic
agents for infertility and for developing new contraceptives.


CLAIMS 


1. A protein comprising an amino acid sequence selected from
the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, 
SEQ ID NO: 8, and SEQ ID NO: 10. 

2. A protein functionally equivalent to a protein comprising 
an amino acid sequence selected from the group consisting of SEQ ID 
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, and SEQ ID NO: 10, 
wherein said protein is selected from the group of (a) and (b),
wherein:

(a) is a protein comprising an amino acid sequence selected from
the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6,
SEQ ID NO: 8, and SEQ ID NO: 10, wherein one or more amino acids are
deleted, added, inserted and/or substituted with different amino
acids; and

(b) is a protein encoded by DNA that hybridizes to a DNA
comprising a nucleotide sequence selected from the group consisting
of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, and SEQ
ID NO: 9.

3. A partial peptide of the protein according to any one of 
claims 1 and 2. 

4. A fusion protein comprising the first protein according to 
any one of claims 1 and 2, fused with a second peptide. 

5. A DNA molecule encoding the protein according to any one of
claims 1 to 3.

6. A vector into which the DNA according to claim 5 is inserted.

7. A transformant having the DNA according to claim 5 in an
expressible form.

8. A method for producing the protein according to any one of
claims 1 to 3, said method comprising the steps of: culturing the
transformant according to claim 7, and recovering the expressed
protein from the transformant or the culture supernatant thereof.

9. A method of screening for a substrate of the protein 
according to any of claims 1 and 2, said method comprising the 
following steps of: 

(a) contacting a test sample with said protein;

(b) detecting the protease activity of said protein against the
test sample; and

(c) selecting a compound that is digested or cleaved by said
protease activity.

10. A substrate of the protein according to any of claims 1 and
2, wherein said substrate can be isolated by the method according
to claim 9.

11. A method of screening for a compound capable of inhibiting 
the activity of the protein according to any of claims 1 and 2, said
method comprising the following steps of: 

(a) contacting the protein with the substrate of claim 10 in 
the presence of a test sample; 

(b) detecting the protease activity of the protein against the 
substrate; and 

(c) selecting a compound that reduces the protease activity 
relative to the protease activity detected in the absence of the test 
sample. 

12. A compound that inhibits the activity of the protein 
according to any of claims 1 and 2, wherein said compound can be
isolated by the method according to claim 11. 

13. An antibody that binds to the protein according to any of 
claims 1 and 2. 

14. A method for detecting or assaying the protein according 
to any of claims 1 and 2, said method comprising the steps of: 
contacting the antibody according to claim 13 with a test sample that 
is anticipated to contain the protein; and detecting or assaying 
formation of the immune-complex between the antibody and the protein. 

15. A nucleotide sequence specifically hybridizing to the DNA 
comprising the nucleotide sequence selected from the group 
consisting of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 
7, and SEQ ID NO: 9, wherein the nucleotide sequence is at least 15 
nucleotides in length.


ABSTRACT

Two novel trypsin-family serine proteases specifically
expressed in adult mouse testis ("Tespec PRO-1" and "Tespec PRO-2"),
and a novel trypsin-family serine protease derived from mouse
("Tespec PRO-3"), have been isolated. Also, two novel trypsin-family
serine proteases derived from human ("Tespec PRO-2" and "Tespec
PRO-3") have been isolated. It has been suggested that these proteins
are involved in sperm differentiation and maturation, and in sperm
functions (e.g., fertilization). Therefore, these proteins are
useful for development of novel therapeutics and diagnostics for
infertility, as well as for development of novel contraceptives.


Figure 1


10 20 30 40 50 60 70 80 90 

CC76CC7CAG7G776GAGC7CCCCA776C7GA7G7GCAGGCAAGCCGA76AAACGA7G6AA66ACAGAAGAACA6GCC7GT7GC7GCCAT 

IIKRWK0RR7GLLLPL 

100 110 120 130 140 150 160 170 180 

7GG7CC7CCTG77677TGG6GCA7G7A6C7CAC7GGCA7GGG7A7G7GGCCGGCGAA7GA67A6CAGATCCCAACAAC77AACAA7GC7T 
V L L L F G A C S S L A V V C G R R II S S R S Q Q L H N A S 

190 200 210 220 230 240 250 260 270 

CTGCTATCGT6GAAGGCAAACCTGCTTCTGCTATCGTGGGAGGCAAACCT6CAAACATCTTGGAGTTCCCCTGGCATGTGGGGATTATGA 
A 1VEGKPASA I V G G K P A N I L E F P W H V G I H N 

280 290 300 310 320 330 340 350 360 

ATCATGGTAGTCATCTCTGTG6GGGA7CTATTCTCAATGAGTG6T66GTTCTATCTGCA7CCCATTGCTTCGACCAACTAAACAACTCTA 
HGSHLCGG S I LNEWW V L S A S H C F 0 Q L N N S K 

370 380 390 400 410 420 430 440 450 

AATTGGAGATCATTCATGGCAGTGAAGACCTCAGCACAAAGGGCATAAAGTATCAGAAAG7GGACAA6TTATTCTTGCACCCAAAGTTTG 
L EI IHGT E DLSTK G IKYQKVOKLF LHPKF O 

460 470 480 490 500 510 520 530 540 

ATGACTGGCTCCTGGACAACGACATAGCTTTGC7CTTGCTCAAATCCCCATTAAACTTGAGTGTCAACAGGATACCTATCTGCACTTCAG 
D WLLDHD I ALLLLKSPL N L S V H R I P I C 7 S E 

550 560 570 580 590 600 610 620 630 

AAATC7C7GACA7ACAGGCA7GGAGGAAC7GC7GGG7GACA6GA7GGGGCA77AC7AA7AC7AG76AAAAAGGAGTCCAACCCACAA77C 
| S D I QAWR N CWV7G W G I 7N7 SEK6V QP7I L 

640 650 660 670 680 690 700 710 720 

7GCAGGCAG7CAAAG7GGA7C767ACAGA7GGGA77GG7G7GGC7ATA7777G7C7C7A77AACCAAGAA7A7GC767G7GC7G6GAC7C 

Q A V KV D LY RW DW C G Y l L S L LT KN M LC A G7 Q 

730 740 750 760 770 780 790 800 810 

AAGA7CC7GGGAAGGA7GCC7GCCAG6GCGACAG7GGA6GAGC7C7CG777GCAACAAAAA6AGAAACACAGCCA777667ACCAGGTGG 

0 p G K DACQGD SGG'ALV C NK KRN7A IW YQV6 

820 830 840 850 860 870 880 890 900 

GCA77G7CAGC7GGGGCA7GGGC7G7GGCAAGAAGAA7CTGCCA6GAGTA7ACACCAAGG7G7CACAC7A7G7GAGGTGGA7CAGCAAGC 

1 V S W 6 U G C G K K H L P G V Y T . K V S H Y V R * I S K 0 

910 920 930 940 950 960 970 980 990 

A6ACAGCGAAG6C66GGAGGCC7TA7A7G7A7GAGCAGAAC7C76CGTGCCC7776G7GC7CTC77GCCGGGCTA7CT7GT7CC7A7ATT 

7 A K A G R P Y II Y E Q N S A C P L V L S C R.A.I L F L Y F 

1000 1010 1020 1030 1040 
77G7AA7GTnC77C7AAGC7GA7GA77AAACG7GAGAC7GCC 
V U F L L 7 * 

Figure 2


10 20 30 40 50 60 70 80 90 

CCCACGCfiTNCGGTTfiTATCAATGTGGGCABfiGCATCAAGfiCAGGCACCACTGCACTSGAATGACAACATGATGCTCCCACTTCTAATTG 

M M I P L L I A 

100 110 123 130 1 40 1 50 1 60 170 180 

CACT6CTCATGGCTTCCAAGGGACAA6CTAAGGACCAGCAAGAATCAGTTCT6TGTG6CCACAGACCTGCCTTCCCAAACTCATCATGGC 
ILKA S K 6 Q A K 0 Q Q E S V L C G H R P A F P H S S W L 

190 200 210 220 230 240 250 260 270 

TGCCATTGCSGGAGCTGCTTGAGGTCCAGCATGGTGAGTTCCCATGGCAA6TGA6TATCCAGATGCTT6GGAAACACCTGTGT6GAGGCT 
P L R E L L £ V 0 H G E F P W Q V S I . Q M L G K H L C G G S 

280 290 300 310 320 330 340 350 360 

CCATCATCCACCGGT6GTGGGTTCTGACAGCA6CACACTGCTTCCCGAGAACCCTATTAGAACTGGTAGCAGTCAATGTCACTGTGGTCA 
I I H R V * V L T A A H C F P R T L L E L V A V N V T V V M 

370 380 390 400 410 420 430 440 450 

TG&GAATCAAGACTTTGAGTGACACCAACTTAGA6AGAAAACAAGTGCAGAA6ATCATTGCTCACAGAGACTACAAACCGCCC6ACCTT6 
Gt KTF SD TNLERKQVQ K I I AHRDYKPPDLD 

460 470 480 490 500 510 520 530 540 

ACAGCGACCTCTGCCTGCTCCTACTT6CCACGCCAATCCAATTCAATAAA6ACAAAATGCCCATCTGCCTGCCACAGAG6GAGAACTCCT 
S JL L C L L.L.L A T P I « F N K 0 K It P I C L P Q R E N S W 

550 560 570 580 590 600 610 620 630 

G6GACCGGTGCTGGATGTCA6AGTGGGCATATACTCATGGCCATGGTTCAGCCAAAGGCTCAAACATGCACCTGAAGAAGCTCAG6GTG6 
D R C W BSEWA YT H G H 6 S A K 6 S N U H L K K L ft V V 

640 650 660 670 680 690 700 710 720 

TTCAGATTAGCTGGAG6ACATGTGC6AAGAGGGTGACTCAGCTCTCCAGGAACATGCTTTGTGCTTGGAAGGAAGTGGGCACCAACGGCA 
Q I S W R T C A K R V T H L S R N M L C A W K E V G T H <j K 

730 740 750 760 770 780 790 800 810 

AGTGCCAGGGAGACAGCGGG6CACCCATGGTC7GTGCTAACTG6GAGACTCGGAGACTCTTTCAAGTGG6TGTCT7CAGCTGGGGCATAA 
C Q G P S 6 A P V V C A N W E T R R L F Q V G V F S W G I T 

820 830 840 850 860 870 880 890 900 

CTTCA6GATCCAGG6GGAGGCCAGGCATTTTTGTGTCTGTGGCTCAGTTTATCCCAT6GATCCTGGAGGA6ACACAAAGGGA6GGACGAG 
S G S R 6 R P 61 F VS V A Q F I P W I L E E T Q R E G R A 

910 920 930 940 950 960 970 980 990 

CCCTTGCCCTCTCAAAGGCtTCAAAAAGTCTCTTGGCTGGCAGTCCACGCTACCATCCCATATTGCTAAGCATGGGCTCTCAAATACTGC 
LA LS K A S KSLLAGSP R Y H P ILLSHGS Q I L L 

1000 1010 1020 1030 
TTGCTGCCATATTTTCTGATGAJAAA.TCAAATTGCTAAGCTCTG 
A A ( F S 0 35 K S H C * 

Figure 3

Tespec PRO-1 pep
Tespec PRO-2 pep
h. prostasin
m. acrosin prec
m. trypsin prec


MKRWKDRRTG LLLPLVLLLF GACSSLAHVC GRRHSSRSQQ LNNASAIVEG KPASA I VGGK 

M MLP LLIALLHASK GOAKDGG ESVLCGHRPA FPNSSWL PLRELLEVO 

MAQKGVLGPG QL6AVAILLY LGLLRSG — T GAEGAEAPCG 

MVEM : LPTVAVLV- LAVSWA — K ONTTCDGPCG 

MS -ALL I- LALVGAA— V AFPVDDDD — 


V APQA- -RITGGSSAV 

LRFRGNSGAG TRJVSGOSAQ 
-KIVGGYTCR 


60 
47 
52 
50 
31 


Tespec PRO-1 pep 
Tespec PRO-2 pep 
h. prostas i n 
m. acros i n prec 
m. tryps i n prec 


Te3pec PRO-1 pep 
Tespec PRO-2 pep 
h. prostas in 
m. acros in prec 
m. tryps in prec 


FDQ- 


PAN ! LEFPWH VG I MNHGS HLCGGS I LNEWWV 

HG EFPWQ VSIQilLGK— HLCGGS I IHRWWV 

AG QWPWQ VSt-TYEG VHVCGGS LVSEQWV 

LG AWPWM VSLOIFTSHN SRRYHACGGS LLNSHWV 

ES SVPY Q VSL— MAG YHFCGGS LINDOWV jVSA AHC|YKYR 

: * * **** ••• ** 


LNNSKL 

TLLELVAVNV 
FPSEHHK EAYEVKLGAH 
LTA AHCFDNKKKV YDWRLVFGAQ 
0 VRLGEH 


;lta ahc fpr- 

LSA AHCF 


Ell HGTEDLS TKGJKYQKVD KLFLHPKFDD WLLDN I 
TWUGIKTFS DTNLERKQVO Kl IAHRDYKP PDLDSl 

OLD SY SEDAKVSTLK Dl IPHPSYLO EGSQG 

E I EYGRNKPV KEPOOERYVO K I V I HEKYNV V7EGN| 

Nl N- — VL EGNEQFVDSA KJ IRHPNYNS UfTLDN | 

- • ■ • * ■ 


I ALL LLKSPLNLSV NRIPICTSE- 
.CLL LLATP I QFNK DKMPICLPQ- 
I ALL QLSRP I TFSR YIRP1CLPAA 
1 1 ALL XITPPVTCGN FIGPCCLPHF 
IHLI KLASPVTLNA RVASVPLP— 


106 
94 
103 
107 
76 


165 
153 
158 
167 
129 


Tespec PRO-1 pep 
Taspec PRO - 2 pep 
h. prostas i n 
m. acros i n prec 
m. trypsin prec 


ISD1GAWRN- CW/TGW31TN TSEKGVQPT I -LOAVKVDLY RWDWCG Y ILSL 

REN— SWDR- CWMSEWAYTH GHGSAKGSNM HLKKLRWQI SWRTCA K RVTQ- 

NASFPNGLH- CTVTGW5HVA PSVSLLTPKP -LQQLEVPL I 
KAGPPOIPHT CYVTGHGY I K EKAPRPSP-V -LMEARVDL I 

SSCAPAGTO- CL 1 SGWGNTL SN-GVNNPDL -LQCVDAPVL PQADCEA 
* . . . * *. 


SRETCNCLYN I OAKPEEPHF 
DLDLCNSTQW YNGR — — 
S YPGD 


214 

201 
216 
219 
178 


Tespec PRO-1 pep 
Taspec PRO-2 pep 
h. prostas i n 
m. acros i n prec 
m. trypsin prec 


LTKNMLCAGT- ODPGK 
LSRNMLCAWK EVGTN 


DACQG DSGGAQ71 CNK KRNTAIWYQV GIVSWGMGCG KKNLPGVYTK 
GKCQG DSGAPMVl CA- NWETRRLFQV GVFSWGITSG SRGRPGIFVS 
VOEDMVCAGY VEG.GK DACQG DSGGPLSICPV E- 


_ -G-LWYLT GIVSWGDACG ARNRPGVYTL 

VTSTNVCAGY PEGK 1 1575155 DSGGPLM CRD NVDS-PFVW GJTSWGVGCA RAKRPGVYTA 
ITNNMICVGF LEGGKlD SCQG DSGGPW|C NG ELQG-I VSWGYGCA GPDAPGVYTK 


274 
260 
273 
278 
231 


Tespec PRO-1 pep 
Tespec PRO-2 pep 
h. prostas i n 
m. acros i n prec 
ra. trypsin prec 


VSHYVRWISK QT — 
VAQF I PW I LE ET— 
ASSYASWIQS KVT- 
TWDYLDW 1 AS K1GPNALHLI 

VCNYVDWIQN Tl 

... ** - 


AKAGRPYMYE QNSACPLVLS C-R — 
QREGRALALS KASKSLLAGS P— RYH- 


~ELQ PRWPQTOES -QPOSNLCGS HLA- 


-FSS APAQGL- 


QPATPHPPTT RHPHVSFHPP SLRPPWYFQH LPSRPLYLRP 


308 
296 
320 
338 
243 


Tespec PRO—1 pep 
Tespec PRO-2 pep 
h. prostas in 
m. acros i n prec 
m. tryps i n prec 

Tespec PRO-1 pep 
Tespec PRO-2 pep 
h. prostas in 
m. acros i n prec 
ra. tryps i n prec 


— A1L- 
— PIL- 
LRPIL- 


-FLYF 
-LSMG 
-FLPL 


-VWFL— 
-SOIL— 
-GLALG- 


LAA1F- 

LL 


LRPLLHRPSS TQTSSSLMPL LSPPTPAQPA SFT I ATQHMR HRTTLSFARR LQRL i EALKH 


-SDDKSNG 


-SPW- 


-LSEH 


RTYPMKHPSQ YSGPRNYHYR FSTFEPLSNK PSEPFLHS 
. ADN 


321 
312 
336 
398 
243 

321 
319 
343 
436 
246 

Figure 4


No9-P 

Figure 5

[Diagram labels: poly A; Tespec PRO-1: 1-I, 1-II, 1-III, 1-IV; Tespec PRO-2: 2-I, 2-III, 2-IV, 2-V, 2-VI; His active domain; Asp residue; Ser active domain]

Figure 6

[Primer diagram labels: Tespec PRO-2 specific primers 29(-), 17(+), 95(+); Tespec PRO-1 specific primers No9-A, No9-B; poly A]

[Gel panels, lanes 1-11: No9-A + No9-B (Tespec PRO-1 specific); 29(-) + 17(+)95(+) (Tespec PRO-2 specific); G3PDH]

[Panel labels: Tespec PRO-1 probe; Tespec PRO-2 probe]

Figure 8

[Gel panels, lanes 1-11: No9-A + No9-B (Tespec PRO-1 specific); 29(-) + 17(+)95(+) (Tespec PRO-2 specific); a third, unlabeled panel]

Figure 9


10 20 30 40 50 60 70 80 90 

CTG7GGC7G6CATG77G7CAGCTCTGGC76GAGGCAAAGG77T66CAA7777G6AC7GGAA77GACAAGAA6A7GTrCCAGC7TC7AA7T 

H F Q L L 1 

100 110 120 130 140" 150 160 1 70 180 

CCCCTGCTTTT6GCACTCAAGG6ACATGCCCAGGACAATCCA6AAAACGTACAATGT66CCACAGGCCTGCTTTTCCAAACTCSTCAT6G 
P L L L A L K 6 H A Q D It P E N V Q C 6 H ft P A F P N S S ¥ 

190 200 210 220 230 240 250 260 270 

7TACCA777CA7GAACGGCTTCAA6TCCA6AA7GGTGA67GCCCGTG6CAA67GAG7A7CCAGATG7CAC6GAAACACC7CT6TG6A66C 
L P F H E ft L Q V Q N G E C P If Q V S I Q U S ft K H L C 6 6 

280 290 300 310 320 330 340 350 360 

TCAATCTTACATTGGT66TGG6TTCTGACA6CCGCACACTGCTTCCGAAGAACCCTATTAGACATGGCCGTG6TAAATGTCACTGTGGTC 
S I L H W W W V L T A A H C F R Ft T L L D M A V V N V T V V 

370 380 390 400 410 420 430 440 450 

ATGGGAACGAGAACATTCAGCAACATCCACTCGGAGA6AAAGCAAGTGCAGAAGGTCATTATTCACAAAGA7TACAAACCGCCCCAGCTC 
H 6 7 RTF S H I H S E R K 0 V Q X V I 1 H K D YK PPQ L 

460 470 480 490 500 510 520 530 540 

6ACAGTGACCTCTC7CTGCTTCTACTTGCCACACCAGTGCAATTCAGCAAT7TCAAAATGCCTGTCTGCCTGCAGGAG6AGGASA6GACC 
D S D L S L LL L A T P V Q F S « f K II P V C L Q E E E R T 

550 560 570 580 590 600 610 620 630 

TG6GACTG6TGTTGGAt6GCACAGT6GGTAACGACCAATGG6TATGACCAATATGATGACTTAAACATGCACCTGGAAAAGCTGA6A6TG 
WD W C W M A QW V TT N G Y 0. Q Y D D L H M H L E KL RV 

640 650 660 670 680 690 700 710 720 

6T6CA6ATTAGCCGGAAAGAATGTGCCAAGAGG6TAAACCA6CTGTCCAGGAACATGATT7GTGCTTCGAACGAACCAGGCACCAATGGT 
V Q I S R K E C A K R V N Q L S R H M l C A S N E P G T N 

730 740 750 760 770 780 790 800 810 

ATCnCAAGGGA6ACAGTGGGGCACCTCTGGTTTGTGCTATTTATGGAACCCA6AGACTCTtCCAAGTGGGTGTCTTCAGTGG6GGCATA 
I F K 6 P S G A P \, V C A I YG TQ RLF(IVGVFS6G1 

820 830 840 850 860 870 880 890 900 

AGATCTGGC7CCAGGGG6AGACCT66TATGTTTGTGTCT6TGGCTCAAT7TATTCCATGAAGCCAGGAGGAGACAGAAAAGGA66G6AAA 
RSGSRGRPGHFVSVAQFIP* 

910 920 930 940 950 960 970 980 990 

6CCTACACCATAATCTCAGGATCCACGAGAA6CCGAGAAGCTCACTGG7GTGTGTTCCTCA6TACCCCTTCTTGCTAGGA7TGGGGTCTC 

1000 1010 1020 1030 
AAATGC7GCTGG«;ACCATGTT7ACCGG7G^JJJAACCTAACYRCW 

Figure 10

[Sequence alignment rows: h. Tespec PRO-2; m. Tespec PRO-2]

Figure 11


h. Tespec PRO-2 pep
m. Tespec PRO-2 pep


MFOLLIPLLL AL KGHA QONPENVQCG HRPAFPNSSW LPFHERLQVQ NGECPWQVSI 56 

M MLPLLI ALLMASKGQA KDQQESVLCG HRPAFPNSSH LPLRELLEVQ HGEFPWQVSI 57 

* ** * * *.* ** irno 4 r»»»»M° » ** **** ** ****** 


h. Tespec PRO- 2 pop 
m. Tespec PRO- 2 pep 


IUWWV pTAAHC| FRi 
I1HRWWV T-T AAHCj FPI 


GMSRKHLCGG SI 
QMLGKHLCGG SIIHR1 


RTLL DMAWNVTW HGTRTFSN I H SERKOVOKVI 116 
PRTLL ELVAVNVTW MGIKTFSDTN LERKQVQK1 I 117 
**** ■ ****** ** ***■ *******.* 


h. Tospec PRO- 2 pep 
m. Tespec PR 0-2 pep 


IHKDYKPPQL 'DSULSLLIXA TPVGFSNFKM PVCLQEEERT WDWCWHAQWV TTNGYDGYDD 176 
AHRDYKPPOL DSgLCLLLLA TPIQFNKDKM PICLPQRENS HDRCWMSEWA YTHGHGSAKG 177 


h. Tespcc PRO-2 pep 
m. Tospec PRO-2 pep 


LNMHLEXLRV VOISRKECAK RVNQLSRNM 1 CASNEPGTNG IFKGDSGAPL VCA1 YGTQRL 


SNKHLKKLRV VGISWRTCAK RVTQLSRNML CAWKEVGTNjG KCQGDSGAPM VpAMWETRRL 

* ** 


1>*1 ^ **** -f ** * *** ** ****** ** * t i t * 1 


* * ** ** *** 


236 
237 


h. Tespec PRO-2 pep 
m. Tospec PRO-2 pep 


FQVGVFSGG I RSGSRGRPGM FVSVAOFIP- 
FQVGVFSWGI TSGSRGRPGI FVSVAQF1 PH¥ 
******* ** ******** ********* 


1 LEETOREGR ALALSKASKS LLAGSPRYHP 


265 
297 


h. Tospec PRO-2 pep 
m. Tospec PRO-2 pop 


ILLSKGSQIL UVAIFSDDKS NC 


265 
319 

Figure 13


10 20 30 40 50 60 70 80 90 

SGCCTCTGTCACCCCCSGfiCCCACAGCACAfiCCCAGGGCCATfiCTCCTGTTCTCAGTGTTGCTSCTCCTGTCGCTGGTCACGfiGAACTCA 

M L L F S V L L L L S L V T G T Q 

100 110 120 130 140 150 160 170 180 

GCTCGGTCCACGGACTCCTCTCCCAGAGGCTGGAGTGGCTATCCTAGGCAGGGCTAGGGGAGCCCACCGCCCTCA6CCCCGTCATCCCCC 
L G P R T P L P E A G V A I L G R A R 6 A H R P Q P R H P P 

190 200 210 220 230 240 250 260 270 

CAGCCCAGTCAGTGAATGTGGTGACAGATCTATTTTCGAGGGAAGAAGTCGGTATTCCAGAATCACAGGGGGGATGGAGGCGGAGGTGGG 
S P V S E C G D R S I F E G R T R Y S fl I T G G M E A E V G 

280 290 300 310 320 330 340 350 360 

T6AGTTTGCGTGGCAGGTGAGTATTCA6GCAAGAAGTGAACCTTTCTGTGGCGGCTCCAtCCTCAACAAGTGGTGGATTCTCACTGC6GC 
EFPW QVS I O A RSE PFCG G S I L NKWW I L T A A 

370 380 390 400 410 420 430 440 450 

TCACTGCTTATATTCC6AGGAGGTGTTTCCA&AAGAACTGAGTGTC6TGCTGGGGACCAACGACTTAACTAGCCGATCCATGGAAATAAA 
H C LY SEELFPEELSVVLGTNO LTSP SME IK 


460 _ 470 480 490 500 510 520 530 540 

GGAGGTCGCCAGCAfCATTCTTCACAAAGACTTTAAGAGAGCCAACATGGACAATGAGATTGCCTTGCTGCTGCTGGCTTCGCCCATCAA 
EVAS I I LR KDFKR AN MD N D I ALLLLASP IK 

550 560 570 580 - 590 600 610 620 630 

GCTCGATGACCTGAAGGTGCCCATGTGCCTCGCCACGCAGCGCGGCCCTGCCACATGGCGCGAATGCTG6GTGGCAG6TTGG6GCCAGAC 
L 0 0 L K V P I C L P T Q P G P A T W R E C W V A G W G « T 

640 650 660 670 680 690 700 710 720 

CAATGCTGCTGACAAAAACTCTGTGAAAACGGATCTGATGAAAGCGCCAATGGTCATCATGGACTGGGAGGAGTGTTCAAAGATGTTTCC 
N A A O K N S V K T D L M K A P H V I M D W E E C S K M F P 

730 740 750 760 770 780 790 800 810 

AAAACTTACGAAAAATATGCTGTGTGCCGGATACAAGAATGAGAGCTATGATGCCT6CAAGGGTGACAGTGGGGGGCCTCTGGTCTGCAC 
K L T K M 11 L C A G Y K M E S Y DACK GDSGGPLV C T 

820 830 840 850 860 870 880 890 900 

CCCAGAGCCTGGTGAGAAGTGGTACCAGGTGGGCATCATGAGCTGaGGAAAGAGCTGTGGAGATAAGAAGACCCCAGGGATATACACCTC 
PEP G EKWYQVG I I S W GKSCGD K NT P G I YT S 

910 920 930 940 950 960 970 980 990 

GTTGGTGAACTACAACCTCTGGATCGAGAAAGTGACCCAGCrAGGAGGCAGGCCCTTCAATGGAGAGAAAAGGAGGACTTCTGTCAAACA 
L V N Y MLW I E K V T Q L 6 G R P F H A E K R R T S V K Q 

1000 1010 1020 1 030 1040 1050 1060 1070 1080 
GAAACCTATGGGCTCCCCAGTCTCGGGAGTCCCAGAGCCAGGCAGCCCCAGATCCTGGCTCCTGCTCTGTCCCCTGTCCCATGTGTTGTT 
KPMGSPVS GVPEP G S PRSWLL LCPL SHVLF 

1090 1100 1110 1120 1130 
CA6AGCTATTTTGTACTGATAATAAAATAGAGGCTATTCTTTC 
R A I L Y * 

Figure 14

(Human Tespec PRO-1)

Mouse Tespec PRO-1

63%

Mouse Tespec PRO-3

Figure 15


10 20 30 40 50 60 70 80 90

GTCAGCCTGGCCTCCAACACACAGCACAGCCAGAGCCATGATCCTGCCCTCCATCCTGCTACTTGTTGCCCACACCCTGGAAGCAAATGT
M I L P S I L L L V A H T L E A N V

100 110 120 130 140 150 160 170 180

TGAGTGTGGTGTGAGACCCCTGTATGATAGCAGAATTCAATACTCCAGGATCATAGAAGGGCAGGAGGCTGAGCTGGGTGAGTTTCCATG
E C G V R P L Y D S R I Q Y S R I I E G Q E A E L G E F P W

190 200 210 220 230 240 250 260 270

GCAGGTGAGCATTCAGGAAAGTGACCACCATTTCTGCGGCGGCTCCATTCTCAGTGAGTGGTGGATCCTCACCGTGGCCCACTGCTTCTA
Q V S I Q E S D H H F C G G S I L S E W W I L T V A H C F Y

280 290 300 310 320 330 340 350 360

TGCTCAGGAGCTTTCCCCAACAGATCTCAGAGTCAGAGTGGGAACCAATGACTTAACTACTTCACCCGTGGAACTAGAGGTCACCACCAT
A Q E L S P T D L R V R V G T N D L T T S P V E L E V T T I

370 380 390 400 410 420 430 440 450

AATCCGGCACAAAGGCTTTAAACGGCTGAACATGGACAACGACATTGCCTTGTTGCTGCTAGCCAAGCCCTTGGCGTTCAATGAGCTGAC
I R H K G F K R L N M D N D I A L L L L A K P L A F N E L T

460 470 480 490 500 510 520 530 540

GGTGCCCATCTGCCTTCCTCTCTGGCCCGCCCCTCCCAGCTGGCACGAATGCTGGGTGGCAGGATGGGGCGTAACCAACTCAACTGACAA
V P I C L P L W P A P P S W H E C W V A G W G V T N S T D K

550 560 570 580 590 600 610 620 630

GGAATCTATGTCAACGGATCTGATGAAGGTGCCCATGCGTATCATAGAGTGGGAGGAATGCTTACAGATGTTTCCCAGCCTCACCACAAA
E S M S T D L M K V P M R I I E W E E C L Q M F P S L T T N

640 650 660 670 680 690 700 710 720

CATGCTGTGTGCCTCATATGGTAATGAGAGCTACGATGCTTGCCAGGGTGACAGTGGGGGACCGCTTGTCTGCACCACAGATCCTGGCAG
M L C A S Y G N E S Y D A C Q G D S G G P L V C T T D P G S

730 740 750 760 770 780 790 800 810

TAGGTGGTACCAGGTGGGCATCATCAGCTGGGGCAAGAGCTGTGGAAAAAAAGGCTTCCCAGGGATATATACTGTATTGGCAAAGTATAC
R W Y Q V G I I S W G K S C G K K G F P G I Y T V L A K Y T

820 830 840 850 860 870 880 890 900

CCTGTGGATTGAGAAAATAGCCCAGACAGAGGGGAAGCCCCTGGATTTTAGAGGTCAGAGCTCCTCTAACAAGAAGAAAAACAGACAGAA
L W I E K I A Q T E G K P L D F R G Q S S S N K K K N R Q N

910 920 930 940 950 960 970 980 990

CAATCAGCTCTCCAAATCCCCAGCCCTGAACTGCCCCCAAAGCTGGCTCCTGCCCTGTCTGCTGTCCTTTGCACTGCTTAGAGCCTTGTC
N Q L S K S P A L N C P Q S W L L P C L L S F A L L R A L S

1000 1010 1020

CAACTGGAAATAAAACAATGCAGTCTCTGATCCACCCT
N W K *
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Figure 16 
[Nucleotide sequence alignment of m. Tespec PRO-3 and h. Tespec PRO-3. The aligned residues are not recoverable from the source text; the final alignment positions are 1028 (m. Tespec PRO-3) and 1123 (h. Tespec PRO-3).]
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Figure 17 


[Amino acid sequence alignment of m. Tespec PRO-3 and h. Tespec PRO-3. The aligned residues are not recoverable from the source text; the final alignment positions are 321 (m. Tespec PRO-3) and 352 (h. Tespec PRO-3).]
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JC08 Rec'd PCT/PTO 0 3MAY20W 

SEQUENCE LISTING 

<110> Chiaki Senoo et al . 

<120> Novel Trypsin Family Serine Proteases


<130> 50026/027001 

<150> JP 1998-313366 
<151> 1998-11-04 

<160> 45 

<170> FastSEQ for Windows Version 4.0 

<210> 1 

<211> 1033 

<212> DNA 

<213> Mus musculus 

<220> 
<221> CDS 

<222> (48) ... (1010) 
<400> 1 

cctgcctcag tgttggagct ccccattgct gatgtgcagg caagccg atg aaa cga 56
Met Lys Arg
1

tgg aag gac aga aga aca ggc ctg ttg ctg cca ttg gtc ctc ctg ttg 104
Trp Lys Asp Arg Arg Thr Gly Leu Leu Leu Pro Leu Val Leu Leu Leu
5 10 15

ttt ggg gca tgt agc tca ctg gca tgg gta tgt ggc cgg cga atg agt 152
Phe Gly Ala Cys Ser Ser Leu Ala Trp Val Cys Gly Arg Arg Met Ser
20 25 30 35

agc aga tcc caa caa ctt aac aat gct tct gct atc gtg gaa ggc aaa 200
Ser Arg Ser Gln Gln Leu Asn Asn Ala Ser Ala Ile Val Glu Gly Lys
40 45 50

cct gct tct gct atc gtg gga ggc aaa cct gca aac atc ttg gag ttc 248
Pro Ala Ser Ala Ile Val Gly Gly Lys Pro Ala Asn Ile Leu Glu Phe
55 60 65

ccc tgg cat gtg ggg att atg aat cat ggt agt cat ctc tgt ggg gga 296
Pro Trp His Val Gly Ile Met Asn His Gly Ser His Leu Cys Gly Gly
70 75 80

tct att ctc aat gag tgg tgg gtt cta tct gca tcc cat tgc ttc gac 344
Ser Ile Leu Asn Glu Trp Trp Val Leu Ser Ala Ser His Cys Phe Asp
85 90 95

caa cta aac aac tct aaa ttg gag atc att cat ggc act gaa gac ctc 392
Gln Leu Asn Asn Ser Lys Leu Glu Ile Ile His Gly Thr Glu Asp Leu
100 105 110 115

agc aca aag ggc ata aag tat cag aaa gtg gac aag tta ttc ttg cac 440
Ser Thr Lys Gly Ile Lys Tyr Gln Lys Val Asp Lys Leu Phe Leu His
120 125 130

cca aag ttt gat gac tgg ctc ctg gac aac gac ata gct ttg ctc ttg 488
Pro Lys Phe Asp Asp Trp Leu Leu Asp Asn Asp Ile Ala Leu Leu Leu
135 140 145

ctc aaa tcc cca tta aac ttg agt gtc aac agg ata cct atc tgc act 536
Leu Lys Ser Pro Leu Asn Leu Ser Val Asn Arg Ile Pro Ile Cys Thr
150 155 160

tca gaa atc tct gac ata cag gca tgg agg aac tgc tgg gtg aca gga 584
Ser Glu Ile Ser Asp Ile Gln Ala Trp Arg Asn Cys Trp Val Thr Gly
165 170 175

tgg ggc att act aat act agt gaa aaa gga gtc caa ccc aca att ctg 632
Trp Gly Ile Thr Asn Thr Ser Glu Lys Gly Val Gln Pro Thr Ile Leu
180 185 190 195

cag gca gtc aaa gtg gat ctg tac aga tgg gat tgg tgt ggc tat att 680
Gln Ala Val Lys Val Asp Leu Tyr Arg Trp Asp Trp Cys Gly Tyr Ile
200 205 210

ttg tct cta tta acc aag aat atg ctg tgt gct ggg act caa gat cct 728
Leu Ser Leu Leu Thr Lys Asn Met Leu Cys Ala Gly Thr Gln Asp Pro
215 220 225

ggg aag gat gcc tgc cag ggc gac agt gga gga gct ctc gtt tgc aac 776
Gly Lys Asp Ala Cys Gln Gly Asp Ser Gly Gly Ala Leu Val Cys Asn
230 235 240

aaa aag aga aac aca gcc att tgg tac cag gtg ggc att gtc agc tgg 824
Lys Lys Arg Asn Thr Ala Ile Trp Tyr Gln Val Gly Ile Val Ser Trp
245 250 255

ggc atg ggc tgt ggc aag aag aat ctg cca gga gta tac acc aag gtg 872
Gly Met Gly Cys Gly Lys Lys Asn Leu Pro Gly Val Tyr Thr Lys Val
260 265 270 275

tca cac tat gtg agg tgg atc agc aag cag aca gcg aag gcg ggg agg 920
Ser His Tyr Val Arg Trp Ile Ser Lys Gln Thr Ala Lys Ala Gly Arg
280 285 290

cct tat atg tat gag cag aac tct gcg tgc cct ttg gtg ctc tct tgc 968
Pro Tyr Met Tyr Glu Gln Asn Ser Ala Cys Pro Leu Val Leu Ser Cys
295 300 305

cgg gct atc ttg ttc cta tat ttt gta atg ttt ctt cta acc 1010
Arg Ala Ile Leu Phe Leu Tyr Phe Val Met Phe Leu Leu Thr
310 315 320
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tgatgattaa acgtgagact gcc 1033 
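
As a consistency check on the feature table for SEQ ID NO: 1, the CDS location (48)...(1010) spans 963 nucleotides, that is 321 codons, which agrees with the 321-residue length stated for SEQ ID NO: 2. The following Python sketch translates a CDS given by 1-based inclusive coordinates using the standard genetic code. The function names `cds_codon_count` and `translate_cds` are illustrative assumptions and are not part of the sequence listing; only the first 65 nucleotides of SEQ ID NO: 1 are used.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def cds_codon_count(start: int, end: int) -> int:
    """Number of codons in a CDS given 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def translate_cds(seq: str, start: int, end: int) -> str:
    """Translate seq[start..end] (1-based, inclusive) to one-letter residues."""
    cds = seq.replace(" ", "").lower()[start - 1:end]
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First 65 nucleotides of SEQ ID NO: 1: 47 nt upstream, then six codons.
SEQ1_HEAD = ("cctgcctcag tgttggagct ccccattgct gatgtgcagg caagccg"
             "atgaaacgatggaaggac")

print(cds_codon_count(48, 1010))         # 321
print(translate_cds(SEQ1_HEAD, 48, 65))  # MKRWKD
```

The same length check applies to the other CDS features in this listing, for example (73)...(867) in SEQ ID NO: 5 gives 265 codons.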

<210> 2 
<211> 321 
<212> PRT 

<213> Mus musculus 


<400> 2 


Met Lys Arg Trp Lys Asp Arg Arg Thr Gly Leu Leu Leu Pro Leu Val
1 5 10 15

Leu Leu Leu Phe Gly Ala Cys Ser Ser Leu Ala Trp Val Cys Gly Arg
20 25 30

Arg Met Ser Ser Arg Ser Gln Gln Leu Asn Asn Ala Ser Ala Ile Val
35 40 45

Glu Gly Lys Pro Ala Ser Ala Ile Val Gly Gly Lys Pro Ala Asn Ile
50 55 60

Leu Glu Phe Pro Trp His Val Gly Ile Met Asn His Gly Ser His Leu
65 70 75 80

Cys Gly Gly Ser Ile Leu Asn Glu Trp Trp Val Leu Ser Ala Ser His
85 90 95

Cys Phe Asp Gln Leu Asn Asn Ser Lys Leu Glu Ile Ile His Gly Thr
100 105 110

Glu Asp Leu Ser Thr Lys Gly Ile Lys Tyr Gln Lys Val Asp Lys Leu
115 120 125

Phe Leu His Pro Lys Phe Asp Asp Trp Leu Leu Asp Asn Asp Ile Ala
130 135 140

Leu Leu Leu Leu Lys Ser Pro Leu Asn Leu Ser Val Asn Arg Ile Pro
145 150 155 160

Ile Cys Thr Ser Glu Ile Ser Asp Ile Gln Ala Trp Arg Asn Cys Trp
165 170 175

Val Thr Gly Trp Gly Ile Thr Asn Thr Ser Glu Lys Gly Val Gln Pro
180 185 190

Thr Ile Leu Gln Ala Val Lys Val Asp Leu Tyr Arg Trp Asp Trp Cys
195 200 205

Gly Tyr Ile Leu Ser Leu Leu Thr Lys Asn Met Leu Cys Ala Gly Thr
210 215 220

Gln Asp Pro Gly Lys Asp Ala Cys Gln Gly Asp Ser Gly Gly Ala Leu
225 230 235 240

Val Cys Asn Lys Lys Arg Asn Thr Ala Ile Trp Tyr Gln Val Gly Ile
245 250 255

Val Ser Trp Gly Met Gly Cys Gly Lys Lys Asn Leu Pro Gly Val Tyr
260 265 270

Thr Lys Val Ser His Tyr Val Arg Trp Ile Ser Lys Gln Thr Ala Lys
275 280 285

Ala Gly Arg Pro Tyr Met Tyr Glu Gln Asn Ser Ala Cys Pro Leu Val
290 295 300

Leu Ser Cys Arg Ala Ile Leu Phe Leu Tyr Phe Val Met Phe Leu Leu
305 310 315 320

Thr

<210> 3 
<211> 1034 
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<212> DNA 

<213> Mus musculus 


<220> 
<221> CDS 

<222> (69) ... (1025) 
<223> 


<221> misc_feature
<222> 10 

<223> n = A or C or G or T/U 
<400> 3 

cccacgcgtn cggttgtatc aatgtgggca gggcatcaag gcaggcacca ctgcactgga 60
atgacaac atg atg ctc cca ctt cta att gca ctg ctc atg gct tcc aag 110
Met Met Leu Pro Leu Leu Ile Ala Leu Leu Met Ala Ser Lys
1 5 10

gga caa gct aag gac cag caa gaa tca gtt ctg tgt ggc cac aga cct 158
Gly Gln Ala Lys Asp Gln Gln Glu Ser Val Leu Cys Gly His Arg Pro
15 20 25 30

gcc ttc cca aac tca tca tgg ctg cca ttg cgg gag ctg ctt gag gtc 206
Ala Phe Pro Asn Ser Ser Trp Leu Pro Leu Arg Glu Leu Leu Glu Val
35 40 45

cag cat ggt gag ttc cca tgg caa gtg agt atc cag atg ctt ggg aaa 254
Gln His Gly Glu Phe Pro Trp Gln Val Ser Ile Gln Met Leu Gly Lys
50 55 60

cac ctg tgt gga ggc tcc atc atc cac cgg tgg tgg gtt ctg aca gca 302
His Leu Cys Gly Gly Ser Ile Ile His Arg Trp Trp Val Leu Thr Ala
65 70 75

gca cac tgc ttc ccg aga acc cta tta gaa ctg gta gca gtc aat gtc 350
Ala His Cys Phe Pro Arg Thr Leu Leu Glu Leu Val Ala Val Asn Val
80 85 90

act gtg gtc atg gga atc aag act ttc agt gac acc aac tta gag aga 398
Thr Val Val Met Gly Ile Lys Thr Phe Ser Asp Thr Asn Leu Glu Arg
95 100 105 110

aaa caa gtg cag aag atc att gct cac aga gac tac aaa ccg ccc gac 446
Lys Gln Val Gln Lys Ile Ile Ala His Arg Asp Tyr Lys Pro Pro Asp
115 120 125

ctt gac agc gac ctc tgc ctg ctc cta ctt gcc acg cca atc caa ttc 494
Leu Asp Ser Asp Leu Cys Leu Leu Leu Leu Ala Thr Pro Ile Gln Phe
130 135 140

aat aaa gac aaa atg ccc atc tgc ctg cca cag agg gag aac tcc tgg 542
Asn Lys Asp Lys Met Pro Ile Cys Leu Pro Gln Arg Glu Asn Ser Trp
145 150 155

gac cgg tgc tgg atg tca gag tgg gca tat act cat ggc cat ggt tca 590
Asp Arg Cys Trp Met Ser Glu Trp Ala Tyr Thr His Gly His Gly Ser
160 165 170

gcc aaa ggc tca aac atg cac ctg aag aag ctc agg gtg gtt cag att 638
Ala Lys Gly Ser Asn Met His Leu Lys Lys Leu Arg Val Val Gln Ile
175 180 185 190

agc tgg agg aca tgt gcg aag agg gtg act cag ctc tcc agg aac atg 686
Ser Trp Arg Thr Cys Ala Lys Arg Val Thr Gln Leu Ser Arg Asn Met
195 200 205

ctt tgt gct tgg aag gaa gtg ggc acc aac ggc aag tgc cag gga gac 734
Leu Cys Ala Trp Lys Glu Val Gly Thr Asn Gly Lys Cys Gln Gly Asp
210 215 220

agc ggg gca ccc atg gtc tgt gct aac tgg gag act cgg aga ctc ttt 782
Ser Gly Ala Pro Met Val Cys Ala Asn Trp Glu Thr Arg Arg Leu Phe
225 230 235

caa gtg ggt gtc ttc agc tgg ggc ata act tca gga tcc agg ggg agg 830
Gln Val Gly Val Phe Ser Trp Gly Ile Thr Ser Gly Ser Arg Gly Arg
240 245 250

cca ggc att ttt gtg tct gtg gct cag ttt atc cca tgg atc ctg gag 878
Pro Gly Ile Phe Val Ser Val Ala Gln Phe Ile Pro Trp Ile Leu Glu
255 260 265 270

gag aca caa agg gag gga cga gcc ctt gcc ctc tca aag gcc tca aaa 926
Glu Thr Gln Arg Glu Gly Arg Ala Leu Ala Leu Ser Lys Ala Ser Lys
275 280 285

agt ctc ttg gct ggc agt cca cgc tac cat ccc ata ttg cta agc atg 974
Ser Leu Leu Ala Gly Ser Pro Arg Tyr His Pro Ile Leu Leu Ser Met
290 295 300

ggc tct caa ata ctg ctt gct gcc ata ttt tct gat gat aaa tca aat 1022
Gly Ser Gln Ile Leu Leu Ala Ala Ile Phe Ser Asp Asp Lys Ser Asn
305 310 315

tgc taagctctg 1034
Cys


<210> 4 
<211> 319 
<212> PRT 

<213> Mus musculus 
<400> 4 

Met Met Leu Pro Leu Leu Ile Ala Leu Leu Met Ala Ser Lys Gly Gln
1 5 10 15

Ala Lys Asp Gln Gln Glu Ser Val Leu Cys Gly His Arg Pro Ala Phe
20 25 30

Pro Asn Ser Ser Trp Leu Pro Leu Arg Glu Leu Leu Glu Val Gln His
35 40 45

Gly Glu Phe Pro Trp Gln Val Ser Ile Gln Met Leu Gly Lys His Leu
50 55 60

Cys Gly Gly Ser Ile Ile His Arg Trp Trp Val Leu Thr Ala Ala His
65 70 75 80

Cys Phe Pro Arg Thr Leu Leu Glu Leu Val Ala Val Asn Val Thr Val
85 90 95

Val Met Gly Ile Lys Thr Phe Ser Asp Thr Asn Leu Glu Arg Lys Gln
100 105 110

Val Gln Lys Ile Ile Ala His Arg Asp Tyr Lys Pro Pro Asp Leu Asp
115 120 125

Ser Asp Leu Cys Leu Leu Leu Leu Ala Thr Pro Ile Gln Phe Asn Lys
130 135 140

Asp Lys Met Pro Ile Cys Leu Pro Gln Arg Glu Asn Ser Trp Asp Arg
145 150 155 160

Cys Trp Met Ser Glu Trp Ala Tyr Thr His Gly His Gly Ser Ala Lys
165 170 175

Gly Ser Asn Met His Leu Lys Lys Leu Arg Val Val Gln Ile Ser Trp
180 185 190

Arg Thr Cys Ala Lys Arg Val Thr Gln Leu Ser Arg Asn Met Leu Cys
195 200 205

Ala Trp Lys Glu Val Gly Thr Asn Gly Lys Cys Gln Gly Asp Ser Gly
210 215 220

Ala Pro Met Val Cys Ala Asn Trp Glu Thr Arg Arg Leu Phe Gln Val
225 230 235 240

Gly Val Phe Ser Trp Gly Ile Thr Ser Gly Ser Arg Gly Arg Pro Gly
245 250 255

Ile Phe Val Ser Val Ala Gln Phe Ile Pro Trp Ile Leu Glu Glu Thr
260 265 270

Gln Arg Glu Gly Arg Ala Leu Ala Leu Ser Lys Ala Ser Lys Ser Leu
275 280 285

Leu Ala Gly Ser Pro Arg Tyr His Pro Ile Leu Leu Ser Met Gly Ser
290 295 300

Gln Ile Leu Leu Ala Ala Ile Phe Ser Asp Asp Lys Ser Asn Cys
305 310 315


<210> 5 

<211> 1035 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (73)...(867)

<221> misc_feature

<222> 1032

<223> y=C or T/U

<221> misc_feature
<222> 1033
<223> R=A or G

<221> misc_feature
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<222> 1035 

<223> W=A or T/U 


<400> 5 

ctgtggctgg catgttgtca gctctggctg gaggcaaagg tttggcaatt ttggactgga 60
attgacaaga ag atg ttc cag ctt cta att ccc ctg ctt ttg gca ctc aag 111
Met Phe Gln Leu Leu Ile Pro Leu Leu Leu Ala Leu Lys
1 5 10

gga cat gcc cag gac aat cca gaa aac gta caa tgt ggc cac agg cct 159
Gly His Ala Gln Asp Asn Pro Glu Asn Val Gln Cys Gly His Arg Pro
15 20 25

gct ttt cca aac tcg tca tgg tta cca ttt cat gaa cgg ctt caa gtc 207
Ala Phe Pro Asn Ser Ser Trp Leu Pro Phe His Glu Arg Leu Gln Val
30 35 40 45

cag aat ggt gag tgc ccg tgg caa gtg agt atc cag atg tca cgg aaa 255
Gln Asn Gly Glu Cys Pro Trp Gln Val Ser Ile Gln Met Ser Arg Lys
50 55 60

cac ctc tgt gga ggc tca atc tta cat tgg tgg tgg gtt ctg aca gcc 303
His Leu Cys Gly Gly Ser Ile Leu His Trp Trp Trp Val Leu Thr Ala
65 70 75

gca cac tgc ttc cga aga acc cta tta gac atg gcc gtg gta aat gtc 351
Ala His Cys Phe Arg Arg Thr Leu Leu Asp Met Ala Val Val Asn Val
80 85 90

act gtg gtc atg gga acg aga aca ttc agc aac atc cac tcg gag aga 399
Thr Val Val Met Gly Thr Arg Thr Phe Ser Asn Ile His Ser Glu Arg
95 100 105

aag caa gtg cag aag gtc att att cac aaa gat tac aaa ccg ccc cag 447
Lys Gln Val Gln Lys Val Ile Ile His Lys Asp Tyr Lys Pro Pro Gln
110 115 120 125

ctc gac agt gac ctc tct ctg ctt cta ctt gcc aca cca gtg caa ttc 495
Leu Asp Ser Asp Leu Ser Leu Leu Leu Leu Ala Thr Pro Val Gln Phe
130 135 140

agc aat ttc aaa atg cct gtc tgc ctg cag gag gag gag agg acc tgg 543
Ser Asn Phe Lys Met Pro Val Cys Leu Gln Glu Glu Glu Arg Thr Trp
145 150 155

gac tgg tgt tgg atg gca cag tgg gta acg acc aat ggg tat gac caa 591
Asp Trp Cys Trp Met Ala Gln Trp Val Thr Thr Asn Gly Tyr Asp Gln
160 165 170

tat gat gac tta aac atg cac ctg gaa aag ctg aga gtg gtg cag att 639
Tyr Asp Asp Leu Asn Met His Leu Glu Lys Leu Arg Val Val Gln Ile
175 180 185

agc cgg aaa gaa tgt gcc aag agg gta aac cag ctg tcc agg aac atg 687
Ser Arg Lys Glu Cys Ala Lys Arg Val Asn Gln Leu Ser Arg Asn Met
190 195 200 205

att tgt gct tcg aac gaa cca ggc acc aat ggt atc ttc aag gga gac 735
Ile Cys Ala Ser Asn Glu Pro Gly Thr Asn Gly Ile Phe Lys Gly Asp
210 215 220

agt ggg gca cct ctg gtt tgt gct att tat gga acc cag aga ctc ttc 783
Ser Gly Ala Pro Leu Val Cys Ala Ile Tyr Gly Thr Gln Arg Leu Phe
225 230 235

caa gtg ggt gtc ttc agt ggg ggc ata aga tct ggc tcc agg ggg aga 831
Gln Val Gly Val Phe Ser Gly Gly Ile Arg Ser Gly Ser Arg Gly Arg
240 245 250

cct ggt atg ttt gtg tct gtg gct caa ttt att cca tgaagccagg 877
Pro Gly Met Phe Val Ser Val Ala Gln Phe Ile Pro
255 260 265

aggagacaga aaaggagggg aaagcctaca ccataatctc aggatccacg agaagccgag 937
aagctcactg gtgtgtgttc ctcagtaccc cttcttgcta ggattggggt ctcaaatgct 997
gctggccacc atgtttaccg gtgataaacc taacyrcw 1035 
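
The `misc_feature` entries of SEQ ID NO: 5 define ambiguity codes at positions 1032, 1033 and 1035 (y = C or T/U, R = A or G, W = A or T/U), so the last eight nucleotides `taacyrcw` stand for several concrete sequences. The following Python sketch enumerates them. The helper name `expand` is an illustrative assumption, and the mapping covers only the codes defined in this listing (including n from SEQ ID NO: 3).

```python
from itertools import product

# Ambiguity codes as defined in the <223> lines of SEQ ID NO: 3 and NO: 5.
AMBIGUITY = {"n": "acgt", "y": "ct", "r": "ag", "w": "at"}


def expand(seq: str) -> list[str]:
    """Return every unambiguous sequence consistent with `seq`."""
    choices = [AMBIGUITY.get(base, base) for base in seq.lower()]
    return ["".join(p) for p in product(*choices)]


tail = "taacyrcw"  # positions 1028-1035 of SEQ ID NO: 5
variants = expand(tail)
print(len(variants))  # 8
print(variants[0])    # taaccaca
```

Three two-fold ambiguous positions give 2 x 2 x 2 = 8 possible tails.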

<210> 6 
<211> 265 
<212> PRT 

<213> Homo sapiens 
<400> 6 

Met Phe Gln Leu Leu Ile Pro Leu Leu Leu Ala Leu Lys Gly His Ala
1 5 10 15

Gln Asp Asn Pro Glu Asn Val Gln Cys Gly His Arg Pro Ala Phe Pro
20 25 30

Asn Ser Ser Trp Leu Pro Phe His Glu Arg Leu Gln Val Gln Asn Gly
35 40 45

Glu Cys Pro Trp Gln Val Ser Ile Gln Met Ser Arg Lys His Leu Cys
50 55 60

Gly Gly Ser Ile Leu His Trp Trp Trp Val Leu Thr Ala Ala His Cys
65 70 75 80

Phe Arg Arg Thr Leu Leu Asp Met Ala Val Val Asn Val Thr Val Val
85 90 95

Met Gly Thr Arg Thr Phe Ser Asn Ile His Ser Glu Arg Lys Gln Val
100 105 110

Gln Lys Val Ile Ile His Lys Asp Tyr Lys Pro Pro Gln Leu Asp Ser
115 120 125

Asp Leu Ser Leu Leu Leu Leu Ala Thr Pro Val Gln Phe Ser Asn Phe
130 135 140

Lys Met Pro Val Cys Leu Gln Glu Glu Glu Arg Thr Trp Asp Trp Cys
145 150 155 160

Trp Met Ala Gln Trp Val Thr Thr Asn Gly Tyr Asp Gln Tyr Asp Asp
165 170 175

Leu Asn Met His Leu Glu Lys Leu Arg Val Val Gln Ile Ser Arg Lys
180 185 190

Glu Cys Ala Lys Arg Val Asn Gln Leu Ser Arg Asn Met Ile Cys Ala
195 200 205

Ser Asn Glu Pro Gly Thr Asn Gly Ile Phe Lys Gly Asp Ser Gly Ala
210 215 220

Pro Leu Val Cys Ala Ile Tyr Gly Thr Gln Arg Leu Phe Gln Val Gly
225 230 235 240

Val Phe Ser Gly Gly Ile Arg Ser Gly Ser Arg Gly Arg Pro Gly Met
245 250 255

Phe Val Ser Val Ala Gln Phe Ile Pro
260 265


<210> 7 

<211> 1028 

<212> DNA 

<213> Mus musculus 

<220> 
<221> CDS 

<222> (38) ... (1000) 
<400> 7 

gtcagcctgg cctccaacac acagcacagc cagagcc atg atc ctg ccc tcc atc 55
Met Ile Leu Pro Ser Ile
1 5

ctg cta ctt gtt gcc cac acc ctg gaa gca aat gtt gag tgt ggt gtg 103
Leu Leu Leu Val Ala His Thr Leu Glu Ala Asn Val Glu Cys Gly Val
10 15 20

aga ccc ctg tat gat agc aga att caa tac tcc agg atc ata gaa ggg 151
Arg Pro Leu Tyr Asp Ser Arg Ile Gln Tyr Ser Arg Ile Ile Glu Gly
25 30 35

cag gag gct gag ctg ggt gag ttt cca tgg cag gtg agc att cag gaa 199
Gln Glu Ala Glu Leu Gly Glu Phe Pro Trp Gln Val Ser Ile Gln Glu
40 45 50

agt gac cac cat ttc tgc ggc ggc tcc att ctc agt gag tgg tgg atc 247
Ser Asp His His Phe Cys Gly Gly Ser Ile Leu Ser Glu Trp Trp Ile
55 60 65 70

ctc acc gtg gcc cac tgc ttc tat gct cag gag ctt tcc cca aca gat 295
Leu Thr Val Ala His Cys Phe Tyr Ala Gln Glu Leu Ser Pro Thr Asp
75 80 85

ctc aga gtc aga gtg gga acc aat gac tta act act tca ccc gtg gaa 343
Leu Arg Val Arg Val Gly Thr Asn Asp Leu Thr Thr Ser Pro Val Glu
90 95 100

cta gag gtc acc acc ata atc cgg cac aaa ggc ttt aaa cgg ctg aac 391
Leu Glu Val Thr Thr Ile Ile Arg His Lys Gly Phe Lys Arg Leu Asn
105 110 115

atg gac aac gac att gcc ttg ttg ctg cta gcc aag ccc ttg gcg ttc 439
Met Asp Asn Asp Ile Ala Leu Leu Leu Leu Ala Lys Pro Leu Ala Phe
120 125 130

aat gag ctg acg gtg ccc atc tgc ctt cct ctc tgg ccc gcc cct ccc 487
Asn Glu Leu Thr Val Pro Ile Cys Leu Pro Leu Trp Pro Ala Pro Pro
135 140 145 150

agc tgg cac gaa tgc tgg gtg gca gga tgg ggc gta acc aac tca act 535
Ser Trp His Glu Cys Trp Val Ala Gly Trp Gly Val Thr Asn Ser Thr
155 160 165

gac aag gaa tct atg tca acg gat ctg atg aag gtg ccc atg cgt atc 583
Asp Lys Glu Ser Met Ser Thr Asp Leu Met Lys Val Pro Met Arg Ile
170 175 180

ata gag tgg gag gaa tgc tta cag atg ttt ccc agc ctc acc aca aac 631
Ile Glu Trp Glu Glu Cys Leu Gln Met Phe Pro Ser Leu Thr Thr Asn
185 190 195

atg ctg tgt gcc tca tat ggt aat gag agc tac gat gct tgc cag ggt 679
Met Leu Cys Ala Ser Tyr Gly Asn Glu Ser Tyr Asp Ala Cys Gln Gly
200 205 210

gac agt ggg gga ccg ctt gtc tgc acc aca gat cct ggc agt agg tgg 727
Asp Ser Gly Gly Pro Leu Val Cys Thr Thr Asp Pro Gly Ser Arg Trp
215 220 225 230

tac cag gtg ggc atc atc agc tgg ggc aag agc tgt gga aaa aaa ggc 775
Tyr Gln Val Gly Ile Ile Ser Trp Gly Lys Ser Cys Gly Lys Lys Gly
235 240 245

ttc cca ggg ata tat act gta ttg gca aag tat acc ctg tgg att gag 823
Phe Pro Gly Ile Tyr Thr Val Leu Ala Lys Tyr Thr Leu Trp Ile Glu
250 255 260

aaa ata gcc cag aca gag ggg aag ccc ctg gat ttt aga ggt cag agc 871
Lys Ile Ala Gln Thr Glu Gly Lys Pro Leu Asp Phe Arg Gly Gln Ser
265 270 275

tcc tct aac aag aag aaa aac aga cag aac aat cag ctc tcc aaa tcc 919
Ser Ser Asn Lys Lys Lys Asn Arg Gln Asn Asn Gln Leu Ser Lys Ser
280 285 290

cca gcc ctg aac tgc ccc caa agc tgg ctc ctg ccc tgt ctg ctg tcc 967
Pro Ala Leu Asn Cys Pro Gln Ser Trp Leu Leu Pro Cys Leu Leu Ser
295 300 305 310

ttt gca ctg ctt aga gcc ttg tcc aac tgg aaa taaaacaatg cagtctctga 1020
Phe Ala Leu Leu Arg Ala Leu Ser Asn Trp Lys
315 320

tccaccct 1028

<210> 8 
<211> 321 
<212> PRT 

<213> Mus musculus 
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<400> 8 

Met Ile Leu Pro Ser Ile Leu Leu Leu Val Ala His Thr Leu Glu Ala
1 5 10 15

Asn Val Glu Cys Gly Val Arg Pro Leu Tyr Asp Ser Arg Ile Gln Tyr
20 25 30

Ser Arg Ile Ile Glu Gly Gln Glu Ala Glu Leu Gly Glu Phe Pro Trp
35 40 45

Gln Val Ser Ile Gln Glu Ser Asp His His Phe Cys Gly Gly Ser Ile
50 55 60

Leu Ser Glu Trp Trp Ile Leu Thr Val Ala His Cys Phe Tyr Ala Gln
65 70 75 80

Glu Leu Ser Pro Thr Asp Leu Arg Val Arg Val Gly Thr Asn Asp Leu
85 90 95

Thr Thr Ser Pro Val Glu Leu Glu Val Thr Thr Ile Ile Arg His Lys
100 105 110

Gly Phe Lys Arg Leu Asn Met Asp Asn Asp Ile Ala Leu Leu Leu Leu
115 120 125

Ala Lys Pro Leu Ala Phe Asn Glu Leu Thr Val Pro Ile Cys Leu Pro
130 135 140

Leu Trp Pro Ala Pro Pro Ser Trp His Glu Cys Trp Val Ala Gly Trp
145 150 155 160

Gly Val Thr Asn Ser Thr Asp Lys Glu Ser Met Ser Thr Asp Leu Met
165 170 175

Lys Val Pro Met Arg Ile Ile Glu Trp Glu Glu Cys Leu Gln Met Phe
180 185 190

Pro Ser Leu Thr Thr Asn Met Leu Cys Ala Ser Tyr Gly Asn Glu Ser
195 200 205

Tyr Asp Ala Cys Gln Gly Asp Ser Gly Gly Pro Leu Val Cys Thr Thr
210 215 220

Asp Pro Gly Ser Arg Trp Tyr Gln Val Gly Ile Ile Ser Trp Gly Lys
225 230 235 240

Ser Cys Gly Lys Lys Gly Phe Pro Gly Ile Tyr Thr Val Leu Ala Lys
245 250 255

Tyr Thr Leu Trp Ile Glu Lys Ile Ala Gln Thr Glu Gly Lys Pro Leu
260 265 270

Asp Phe Arg Gly Gln Ser Ser Ser Asn Lys Lys Lys Asn Arg Gln Asn
275 280 285

Asn Gln Leu Ser Lys Ser Pro Ala Leu Asn Cys Pro Gln Ser Trp Leu
290 295 300

Leu Pro Cys Leu Leu Ser Phe Ala Leu Leu Arg Ala Leu Ser Asn Trp
305 310 315 320

Lys



<210> 9 

<211> 1123 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (41)...(1096)

<400> 9
ggcctctgtc acccccgggc ccacagcaca gcccagggcc atg ctc ctg ttc tca 55
Met Leu Leu Phe Ser
1 5

gtg ttg ctg ctc ctg tcc ctg gtc acg gga act cag ctc ggt cca cgg 103
Val Leu Leu Leu Leu Ser Leu Val Thr Gly Thr Gln Leu Gly Pro Arg
10 15 20

act cct ctc cca gag gct gga gtg gct atc cta ggc agg gct agg gga 151
Thr Pro Leu Pro Glu Ala Gly Val Ala Ile Leu Gly Arg Ala Arg Gly
25 30 35

gcc cac cgc cct cag ccc cgt cat ccc ccc agc cca gtc agt gaa tgt 199
Ala His Arg Pro Gln Pro Arg His Pro Pro Ser Pro Val Ser Glu Cys
40 45 50

ggt gac aga tct att ttc gag gga aga act cgg tat tcc aga atc aca 247
Gly Asp Arg Ser Ile Phe Glu Gly Arg Thr Arg Tyr Ser Arg Ile Thr
55 60 65

ggg ggg atg gag gcg gag gtg ggt gag ttt ccg tgg cag gtg agt att 295
Gly Gly Met Glu Ala Glu Val Gly Glu Phe Pro Trp Gln Val Ser Ile
70 75 80 85

cag gca aga agt gaa cct ttc tgt ggc ggc tcc atc ctc aac aag tgg 343
Gln Ala Arg Ser Glu Pro Phe Cys Gly Gly Ser Ile Leu Asn Lys Trp
90 95 100

tgg att ctc act gcg gct cac tgc tta tat tcc gag gag ctg ttt cca 391
Trp Ile Leu Thr Ala Ala His Cys Leu Tyr Ser Glu Glu Leu Phe Pro
105 110 115

gaa gaa ctg agt gtc gtg ctg ggg acc aac gac tta act agc cca tcc 439
Glu Glu Leu Ser Val Val Leu Gly Thr Asn Asp Leu Thr Ser Pro Ser
120 125 130

atg gaa ata aag gag gtc gcc agc atc att ctt cac aaa gac ttt aag 487
Met Glu Ile Lys Glu Val Ala Ser Ile Ile Leu His Lys Asp Phe Lys
135 140 145

aga gcc aac atg gac aat gac att gcc ttg ctg ctg ctg gct tcg ccc 535
Arg Ala Asn Met Asp Asn Asp Ile Ala Leu Leu Leu Leu Ala Ser Pro
150 155 160 165

atc aag ctc gat gac ctg aag gtg ccc atc tgc ctc ccc acg cag ccc 583
Ile Lys Leu Asp Asp Leu Lys Val Pro Ile Cys Leu Pro Thr Gln Pro
170 175 180

ggc cct gcc aca tgg cgc gaa tgc tgg gtg gca ggt tgg ggc cag acc 631
Gly Pro Ala Thr Trp Arg Glu Cys Trp Val Ala Gly Trp Gly Gln Thr
185 190 195

aat gct gct gac aaa aac tct gtg aaa acg gat ctg atg aaa gcg cca 679
Asn Ala Ala Asp Lys Asn Ser Val Lys Thr Asp Leu Met Lys Ala Pro
200 205 210

atg gtc atc atg gac tgg gag gag tgt tca aag atg ttt cca aaa ctt 727
Met Val Ile Met Asp Trp Glu Glu Cys Ser Lys Met Phe Pro Lys Leu
215 220 225

acc aaa aat atg ctg tgt gcc gga tac aag aat gag agc tat gat gcc 775
Thr Lys Asn Met Leu Cys Ala Gly Tyr Lys Asn Glu Ser Tyr Asp Ala
230 235 240 245

tgc aag ggt gac agt ggg ggg cct ctg gtc tgc acc cca gag cct ggt 823
Cys Lys Gly Asp Ser Gly Gly Pro Leu Val Cys Thr Pro Glu Pro Gly
250 255 260

gag aag tgg tac cag gtg ggc atc atc agc tgg gga aag agc tgt gga 871
Glu Lys Trp Tyr Gln Val Gly Ile Ile Ser Trp Gly Lys Ser Cys Gly
265 270 275

gat aag aac acc cca ggg ata tac acc tcg ttg gtg aac tac aac ctc 919
Asp Lys Asn Thr Pro Gly Ile Tyr Thr Ser Leu Val Asn Tyr Asn Leu
280 285 290

tgg atc gag aaa gtg acc cag cta gga ggc agg ccc ttc aat gca gag 967
Trp Ile Glu Lys Val Thr Gln Leu Gly Gly Arg Pro Phe Asn Ala Glu
295 300 305

aaa agg agg act tct gtc aaa cag aaa cct atg ggc tcc cca gtc tcg 1015
Lys Arg Arg Thr Ser Val Lys Gln Lys Pro Met Gly Ser Pro Val Ser
310 315 320 325

gga gtc cca gag cca ggc agc ccc aga tcc tgg ctc ctg ctc tgt ccc 1063
Gly Val Pro Glu Pro Gly Ser Pro Arg Ser Trp Leu Leu Leu Cys Pro
330 335 340

ctg tcc cat gtg ttg ttc aga gct att ttg tac tgataataaa atagaggcta 1116
Leu Ser His Val Leu Phe Arg Ala Ile Leu Tyr
345 350

ttctttc 1123

<210> 10 
<211> 352 
<212> PRT 

<213> Homo sapiens 
<400> 10 

Met Leu Leu Phe Ser Val Leu Leu Leu Leu Ser Leu Val Thr Gly Thr
1 5 10 15

Gln Leu Gly Pro Arg Thr Pro Leu Pro Glu Ala Gly Val Ala Ile Leu
20 25 30

Gly Arg Ala Arg Gly Ala His Arg Pro Gln Pro Arg His Pro Pro Ser
35 40 45

Pro Val Ser Glu Cys Gly Asp Arg Ser Ile Phe Glu Gly Arg Thr Arg
50 55 60

Tyr Ser Arg Ile Thr Gly Gly Met Glu Ala Glu Val Gly Glu Phe Pro
65 70 75 80

Trp Gln Val Ser Ile Gln Ala Arg Ser Glu Pro Phe Cys Gly Gly Ser
85 90 95

Ile Leu Asn Lys Trp Trp Ile Leu Thr Ala Ala His Cys Leu Tyr Ser
100 105 110

Glu Glu Leu Phe Pro Glu Glu Leu Ser Val Val Leu Gly Thr Asn Asp
115 120 125

Leu Thr Ser Pro Ser Met Glu Ile Lys Glu Val Ala Ser Ile Ile Leu
130 135 140

His Lys Asp Phe Lys Arg Ala Asn Met Asp Asn Asp Ile Ala Leu Leu
145 150 155 160

Leu Leu Ala Ser Pro Ile Lys Leu Asp Asp Leu Lys Val Pro Ile Cys
165 170 175

Leu Pro Thr Gln Pro Gly Pro Ala Thr Trp Arg Glu Cys Trp Val Ala
180 185 190

Gly Trp Gly Gln Thr Asn Ala Ala Asp Lys Asn Ser Val Lys Thr Asp
195 200 205

Leu Met Lys Ala Pro Met Val Ile Met Asp Trp Glu Glu Cys Ser Lys
210 215 220

Met Phe Pro Lys Leu Thr Lys Asn Met Leu Cys Ala Gly Tyr Lys Asn
225 230 235 240

Glu Ser Tyr Asp Ala Cys Lys Gly Asp Ser Gly Gly Pro Leu Val Cys
245 250 255

Thr Pro Glu Pro Gly Glu Lys Trp Tyr Gln Val Gly Ile Ile Ser Trp
260 265 270

Gly Lys Ser Cys Gly Asp Lys Asn Thr Pro Gly Ile Tyr Thr Ser Leu
275 280 285

Val Asn Tyr Asn Leu Trp Ile Glu Lys Val Thr Gln Leu Gly Gly Arg
290 295 300

Pro Phe Asn Ala Glu Lys Arg Arg Thr Ser Val Lys Gln Lys Pro Met
305 310 315 320

Gly Ser Pro Val Ser Gly Val Pro Glu Pro Gly Ser Pro Arg Ser Trp
325 330 335

Leu Leu Leu Cys Pro Leu Ser His Val Leu Phe Arg Ala Ile Leu Tyr
340 345 350 


<210> 11 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "76A5sc2-B", an artificially synthesized primer 
sequence 

<400> 11 

gatcmacagg tgccagtcat ca 

<210> 12 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "SPORT SP6", an artificially synthesized primer 


sequence
<400> 12 

atttaggtga cactatagaa 20 

<210> 13 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "SPORT Fw", an artificially synthesized primer 
sequence 

<400> 13 

tgtaaaacga cggccagt 18 

<210> 14 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "Sport RV", an artificially synthesized primer 
sequence 

<400> 14 

caggaaacag ctatgacc 18

<210> 15
<211> 22
<212> DNA

<213> Artificial Sequence
<220> 

<223> "No9-C", an artificially synthesized primer 
sequence 

<400> 15 

atgcttctgc tatcgtggaa gg 22 

<210> 16 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "SPORT T7", an artificially synthesized primer 
sequence 

<400> 16 

taatacgact cactataggg 20 

<210> 17 
<211> 22 




<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-B", an artificially synthesized primer 
sequence 

<400> 17 

ctttgtgctg aggtcttcag tg

<210> 18 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-G", an artificially synthesized primer 
sequence 

<400> 18 

cagtcaatgt cactgtggtc at 

<210> 19 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-J", an artificially synthesized primer 
sequence 

<400> 19 

acttgccgtt ggtgcccact tc 

<210> 20 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-P", an artificially synthesized primer 
sequence 

<400> 20 

gcactggaat gacaacatga tgc 

<210> 21 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-Q", an artificially synthesized primer 
sequence

<400> 21

attggcgtgg caagtaggag ca 

<210> 22 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-N", an artificially synthesized primer 
sequence 

<400> 22 

cgagtctccc agttagcaca ga 

<210> 23 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-M", an artificially synthesized primer 
sequence 

<400> 23 

cggtgacttg gtcatgtctg tg 

<210> 24 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-K", an artificially synthesized primer 
sequence 

<400> 24 

ggatccatga aacgatggaa ggacagaag

<210> 25
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> "No9-O", an artificially synthesized primer
sequence 

<400> 25 

cgcagagttc tgctcataca ta

<210> 26
<211> 21
<212> DNA

<213> Artificial Sequence
<220>

<223> "No9-A", an artificially synthesized primer 
sequence 

<400> 26 

ggcatgtagc tcactggcat g 

<210> 27 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "29 (-)", an artificially synthesized primer 
sequence 

<400> 27 

ggaccagcaa gaatcagttc tg 

<210> 28 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "17 (+) 95 (+)", an artificially synthesized 
primer sequence 

<400> 28 

ctgctaccag ttctaatttg cc 

<210> 29 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "G3PDH 5' ", an artificially synthesized primer 
sequence 

<400> 29 

gagattgttg ccatcaacga cc 

<210> 30 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "G3PDH 3' ", an artificially synthesized primer 
sequence 

<400> 30 

gttgaagtcg caggagacaa cc

<210> 31

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "h-B", an artificially synthesized primer 
sequence 

<400> 31 

agaggtcact gtcgagctgg g 

<210> 32 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "h-D", an artificially synthesized primer 
sequence 

<400> 32 

tgtgaataat gaccttctgc ac 

<210> 33 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "h-A", an artificially synthesized primer 
sequence 

<400> 33 

ttcagcaaca tccactcgga ga 

<210> 34 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "h-C", an artificially synthesized primer 
sequence 

<400> 34 

aagcaagtgc agaaggtcat ta

<210> 35
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> "h-F", an artificially synthesized primer
sequence


<400> 35 

cattggtcgt tacccactgt gc

<210> 36
<211> 23
<212> DNA

<213> Artificial Sequence
<220>

<223> "PRO1-E", an artificially synthesized primer
sequence

<400> 36

attctcaatg agtggtgggt tct

<210> 37
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> "PRO1-D", an artificially synthesized primer
sequence 

<400> 37 

ccagcacaca gcatattctt gg

<210> 38
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> "hPRO3-B", an artificially synthesized primer
sequence

<400> 38

ggaaacagct cctcggaata taagc

<210> 39
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> "hPRO3-D", an artificially synthesized primer
sequence

<400> 39

tggatgggct agttaagtcg ttggt

<210> 40
<211> 23
<212> DNA

<213> Artificial Sequence
<220>

<223> "hPRO3-A", an artificially synthesized primer
sequence 

<400> 40 

ttcgagggaa gaactcggta ttc 23

<210> 41
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> "hPRO3-C", an artificially synthesized primer
sequence

<400> 41

tgtgaaaacg gatctgatga aagcg 25

<210> 42
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> "mPRO3-B", an artificially synthesized primer
sequence

<400> 42

cacctactgc caggatctgt gg 22

<210> 43
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> "mPRO3-D", an artificially synthesized primer
sequence 

<400> 43 

ggctattttc tcaatccaca gggta 25

<210> 44
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> "mPRO3-A", an artificially synthesized primer
sequence

<400> 44

atagagtggg aggaatgctt acaga

<210> 45
<211> 21
<212> DNA

<213> Artificial Sequence
<220>

<223> "mPRO3-C", an artificially synthesized primer
sequence

<400> 45

gctacgatgc ttgccagggt g




SEQUENCE LISTING 

<110> Chiaki Senoo 
Mariko Numata

<120> Novel Trypsin Family Serine Proteases


<130> 50026/027001

<140> US 09/831,180
<141> 2001-05-03 

<150> PCT/JP99/06111 
<151> 1999-11-02 

<150> JP 1998-313366 
<151> 1998-11-04 

<160> 53 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 1033 
<212> DNA 

<213> Mus musculus 

<220> 
<221> CDS 

<222> (48) . . . (1010) 
<400> 1 

cctgcctcag tgttggagct ccccattgct gatgtgcagg caagccg atg aaa cga 56
Met Lys Arg
1

tgg aag gac aga aga aca ggc ctg ttg ctg cca ttg gtc ctc ctg ttg 104
Trp Lys Asp Arg Arg Thr Gly Leu Leu Leu Pro Leu Val Leu Leu Leu
5 10 15

ttt ggg gca tgt agc tca ctg gca tgg gta tgt ggc cgg cga atg agt 152
Phe Gly Ala Cys Ser Ser Leu Ala Trp Val Cys Gly Arg Arg Met Ser
20 25 30 35

agc aga tcc caa caa ctt aac aat gct tct gct atc gtg gaa ggc aaa 200
Ser Arg Ser Gln Gln Leu Asn Asn Ala Ser Ala Ile Val Glu Gly Lys
40 45 50

cct gct tct gct atc gtg gga ggc aaa cct gca aac atc ttg gag ttc 248
Pro Ala Ser Ala Ile Val Gly Gly Lys Pro Ala Asn Ile Leu Glu Phe
55 60 65

ccc tgg cat gtg ggg att atg aat cat ggt agt cat ctc tgt ggg gga 296
Pro Trp His Val Gly Ile Met Asn His Gly Ser His Leu Cys Gly Gly
70 75 80

tct att ctc aat gag tgg tgg gtt cta tct gca tcc cat tgc ttc gac 344
Ser Ile Leu Asn Glu Trp Trp Val Leu Ser Ala Ser His Cys Phe Asp
85 90 95

caa cta aac aac tct aaa ttg gag atc att cat ggc act gaa gac ctc 392
Gln Leu Asn Asn Ser Lys Leu Glu Ile Ile His Gly Thr Glu Asp Leu
100 105 110 115

agc aca aag ggc ata aag tat cag aaa gtg gac aag tta ttc ttg cac 440
Ser Thr Lys Gly Ile Lys Tyr Gln Lys Val Asp Lys Leu Phe Leu His
120 125 130

cca aag ttt gat gac tgg ctc ctg gac aac gac ata gct ttg ctc ttg 488
Pro Lys Phe Asp Asp Trp Leu Leu Asp Asn Asp Ile Ala Leu Leu Leu
135 140 145

ctc aaa tcc cca tta aac ttg agt gtc aac agg ata cct atc tgc act 536
Leu Lys Ser Pro Leu Asn Leu Ser Val Asn Arg Ile Pro Ile Cys Thr
150 155 160

tca gaa atc tct gac ata cag gca tgg agg aac tgc tgg gtg aca gga 584
Ser Glu Ile Ser Asp Ile Gln Ala Trp Arg Asn Cys Trp Val Thr Gly
165 170 175

tgg ggc att act aat act agt gaa aaa gga gtc caa ccc aca att ctg 632
Trp Gly Ile Thr Asn Thr Ser Glu Lys Gly Val Gln Pro Thr Ile Leu
180 185 190 195

cag gca gtc aaa gtg gat ctg tac aga tgg gat tgg tgt ggc tat att 680
Gln Ala Val Lys Val Asp Leu Tyr Arg Trp Asp Trp Cys Gly Tyr Ile
200 205 210

ttg tct cta tta acc aag aat atg ctg tgt gct ggg act caa gat cct 728
Leu Ser Leu Leu Thr Lys Asn Met Leu Cys Ala Gly Thr Gln Asp Pro
215 220 225

ggg aag gat gcc tgc cag ggc gac agt gga gga gct ctc gtt tgc aac 776
Gly Lys Asp Ala Cys Gln Gly Asp Ser Gly Gly Ala Leu Val Cys Asn
230 235 240

aaa aag aga aac aca gcc att tgg tac cag gtg ggc att gtc agc tgg 824
Lys Lys Arg Asn Thr Ala Ile Trp Tyr Gln Val Gly Ile Val Ser Trp
245 250 255

ggc atg ggc tgt ggc aag aag aat ctg cca gga gta tac acc aag gtg 872
Gly Met Gly Cys Gly Lys Lys Asn Leu Pro Gly Val Tyr Thr Lys Val
260 265 270 275

tca cac tat gtg agg tgg atc agc aag cag aca gcg aag gcg ggg agg 920
Ser His Tyr Val Arg Trp Ile Ser Lys Gln Thr Ala Lys Ala Gly Arg
280 285 290

cct tat atg tat gag cag aac tct gcg tgc cct ttg gtg ctc tct tgc 968
Pro Tyr Met Tyr Glu Gln Asn Ser Ala Cys Pro Leu Val Leu Ser Cys
295 300 305

cgg gct atc ttg ttc cta tat ttt gta atg ttt ctt cta acc 1010
Arg Ala Ile Leu Phe Leu Tyr Phe Val Met Phe Leu Leu Thr
310 315 320

tgatgattaa acgtgagact gcc 1033


<210> 2 

<211> 321 

<212> PRT 

<213> Mus musculus 


<400> 2 


Met Lys Arg Trp Lys Asp Arg Arg Thr Gly Leu Leu Leu Pro Leu Val
1 5 10 15

Leu Leu Leu Phe Gly Ala Cys Ser Ser Leu Ala Trp Val Cys Gly Arg
20 25 30

Arg Met Ser Ser Arg Ser Gln Gln Leu Asn Asn Ala Ser Ala Ile Val
35 40 45

Glu Gly Lys Pro Ala Ser Ala Ile Val Gly Gly Lys Pro Ala Asn Ile
50 55 60

Leu Glu Phe Pro Trp His Val Gly Ile Met Asn His Gly Ser His Leu
65 70 75 80

Cys Gly Gly Ser Ile Leu Asn Glu Trp Trp Val Leu Ser Ala Ser His
85 90 95

Cys Phe Asp Gln Leu Asn Asn Ser Lys Leu Glu Ile Ile His Gly Thr
100 105 110

Glu Asp Leu Ser Thr Lys Gly Ile Lys Tyr Gln Lys Val Asp Lys Leu
115 120 125

Phe Leu His Pro Lys Phe Asp Asp Trp Leu Leu Asp Asn Asp Ile Ala
130 135 140

Leu Leu Leu Leu Lys Ser Pro Leu Asn Leu Ser Val Asn Arg Ile Pro
145 150 155 160

Ile Cys Thr Ser Glu Ile Ser Asp Ile Gln Ala Trp Arg Asn Cys Trp
165 170 175

Val Thr Gly Trp Gly Ile Thr Asn Thr Ser Glu Lys Gly Val Gln Pro
180 185 190

Thr Ile Leu Gln Ala Val Lys Val Asp Leu Tyr Arg Trp Asp Trp Cys
195 200 205

Gly Tyr Ile Leu Ser Leu Leu Thr Lys Asn Met Leu Cys Ala Gly Thr
210 215 220

Gln Asp Pro Gly Lys Asp Ala Cys Gln Gly Asp Ser Gly Gly Ala Leu
225 230 235 240

Val Cys Asn Lys Lys Arg Asn Thr Ala Ile Trp Tyr Gln Val Gly Ile
245 250 255

Val Ser Trp Gly Met Gly Cys Gly Lys Lys Asn Leu Pro Gly Val Tyr
260 265 270

Thr Lys Val Ser His Tyr Val Arg Trp Ile Ser Lys Gln Thr Ala Lys
275 280 285

Ala Gly Arg Pro Tyr Met Tyr Glu Gln Asn Ser Ala Cys Pro Leu Val
290 295 300

Leu Ser Cys Arg Ala Ile Leu Phe Leu Tyr Phe Val Met Phe Leu Leu
305 310 315 320

Thr


<210> 3 

<211> 1034 

<212> DNA 

<213> Mus musculus 


<220> 
<221> CDS 


<222> (69) . . . (1025) 
<223> 


<221> misc_feature 
<222> 10 

<223> n = A or C or G or T/U 


<400> 3 

cccacgcgtn cggttgtatc aatgtgggca gggcatcaag gcaggcacca ctgcactgga 60
atgacaac atg atg ctc cca ctt cta att gca ctg ctc atg gct tcc aag 110
Met Met Leu Pro Leu Leu Ile Ala Leu Leu Met Ala Ser Lys
1 5 10

gga caa gct aag gac cag caa gaa tca gtt ctg tgt ggc cac aga cct 158
Gly Gln Ala Lys Asp Gln Gln Glu Ser Val Leu Cys Gly His Arg Pro
15 20 25 30

gcc ttc cca aac tca tca tgg ctg cca ttg cgg gag ctg ctt gag gtc 206
Ala Phe Pro Asn Ser Ser Trp Leu Pro Leu Arg Glu Leu Leu Glu Val
35 40 45

cag cat ggt gag ttc cca tgg caa gtg agt atc cag atg ctt ggg aaa 254
Gln His Gly Glu Phe Pro Trp Gln Val Ser Ile Gln Met Leu Gly Lys
50 55 60

cac ctg tgt gga ggc tcc atc atc cac cgg tgg tgg gtt ctg aca gca 302
His Leu Cys Gly Gly Ser Ile Ile His Arg Trp Trp Val Leu Thr Ala
65 70 75

gca cac tgc ttc ccg aga acc cta tta gaa ctg gta gca gtc aat gtc 350
Ala His Cys Phe Pro Arg Thr Leu Leu Glu Leu Val Ala Val Asn Val
80 85 90

act gtg gtc atg gga atc aag act ttc agt gac acc aac tta gag aga 398
Thr Val Val Met Gly Ile Lys Thr Phe Ser Asp Thr Asn Leu Glu Arg
95 100 105 110

aaa caa gtg cag aag atc att gct cac aga gac tac aaa ccg ccc gac 446
Lys Gln Val Gln Lys Ile Ile Ala His Arg Asp Tyr Lys Pro Pro Asp
115 120 125

ctt gac agc gac ctc tgc ctg ctc cta ctt gcc acg cca atc caa ttc 494
Leu Asp Ser Asp Leu Cys Leu Leu Leu Leu Ala Thr Pro Ile Gln Phe
130 135 140

aat aaa gac aaa atg ccc atc tgc ctg cca cag agg gag aac tcc tgg 542
Asn Lys Asp Lys Met Pro Ile Cys Leu Pro Gln Arg Glu Asn Ser Trp
145 150 155

gac cgg tgc tgg atg tca gag tgg gca tat act cat ggc cat ggt tca 590
Asp Arg Cys Trp Met Ser Glu Trp Ala Tyr Thr His Gly His Gly Ser
160 165 170

gcc aaa ggc tca aac atg cac ctg aag aag ctc agg gtg gtt cag att 638
Ala Lys Gly Ser Asn Met His Leu Lys Lys Leu Arg Val Val Gln Ile
175 180 185 190

agc tgg agg aca tgt gcg aag agg gtg act cag ctc tcc agg aac atg 686
Ser Trp Arg Thr Cys Ala Lys Arg Val Thr Gln Leu Ser Arg Asn Met
195 200 205

ctt tgt gct tgg aag gaa gtg ggc acc aac ggc aag tgc cag gga gac 734
Leu Cys Ala Trp Lys Glu Val Gly Thr Asn Gly Lys Cys Gln Gly Asp
210 215 220

agc ggg gca ccc atg gtc tgt gct aac tgg gag act cgg aga ctc ttt 782
Ser Gly Ala Pro Met Val Cys Ala Asn Trp Glu Thr Arg Arg Leu Phe
225 230 235

caa gtg ggt gtc ttc agc tgg ggc ata act tca gga tcc agg ggg agg 830
Gln Val Gly Val Phe Ser Trp Gly Ile Thr Ser Gly Ser Arg Gly Arg
240 245 250

cca ggc att ttt gtg tct gtg gct cag ttt atc cca tgg atc ctg gag 878
Pro Gly Ile Phe Val Ser Val Ala Gln Phe Ile Pro Trp Ile Leu Glu
255 260 265 270

gag aca caa agg gag gga cga gcc ctt gcc ctc tca aag gcc tca aaa 926
Glu Thr Gln Arg Glu Gly Arg Ala Leu Ala Leu Ser Lys Ala Ser Lys
275 280 285

agt ctc ttg gct ggc agt cca cgc tac cat ccc ata ttg cta agc atg 974
Ser Leu Leu Ala Gly Ser Pro Arg Tyr His Pro Ile Leu Leu Ser Met
290 295 300

ggc tct caa ata ctg ctt gct gcc ata ttt tct gat gat aaa tca aat 1022
Gly Ser Gln Ile Leu Leu Ala Ala Ile Phe Ser Asp Asp Lys Ser Asn
305 310 315

tgc taagctctg 1034
Cys


<210> 4 
<211> 319 
<212> PRT 

<213> Mus musculus 
<400> 4 

Met Met Leu Pro Leu Leu Ile Ala Leu Leu Met Ala Ser Lys Gly Gln
1 5 10 15

Ala Lys Asp Gln Gln Glu Ser Val Leu Cys Gly His Arg Pro Ala Phe
20 25 30

Pro Asn Ser Ser Trp Leu Pro Leu Arg Glu Leu Leu Glu Val Gln His
35 40 45

Gly Glu Phe Pro Trp Gln Val Ser Ile Gln Met Leu Gly Lys His Leu
50 55 60

Cys Gly Gly Ser Ile Ile His Arg Trp Trp Val Leu Thr Ala Ala His
65 70 75 80

Cys Phe Pro Arg Thr Leu Leu Glu Leu Val Ala Val Asn Val Thr Val
85 90 95

Val Met Gly Ile Lys Thr Phe Ser Asp Thr Asn Leu Glu Arg Lys Gln
100 105 110

Val Gln Lys Ile Ile Ala His Arg Asp Tyr Lys Pro Pro Asp Leu Asp
115 120 125

Ser Asp Leu Cys Leu Leu Leu Leu Ala Thr Pro Ile Gln Phe Asn Lys
130 135 140

Asp Lys Met Pro Ile Cys Leu Pro Gln Arg Glu Asn Ser Trp Asp Arg
145 150 155 160

Cys Trp Met Ser Glu Trp Ala Tyr Thr His Gly His Gly Ser Ala Lys
165 170 175

Gly Ser Asn Met His Leu Lys Lys Leu Arg Val Val Gln Ile Ser Trp
180 185 190

Arg Thr Cys Ala Lys Arg Val Thr Gln Leu Ser Arg Asn Met Leu Cys
195 200 205

Ala Trp Lys Glu Val Gly Thr Asn Gly Lys Cys Gln Gly Asp Ser Gly
210 215 220

Ala Pro Met Val Cys Ala Asn Trp Glu Thr Arg Arg Leu Phe Gln Val
225 230 235 240

Gly Val Phe Ser Trp Gly Ile Thr Ser Gly Ser Arg Gly Arg Pro Gly
245 250 255

Ile Phe Val Ser Val Ala Gln Phe Ile Pro Trp Ile Leu Glu Glu Thr
260 265 270

Gln Arg Glu Gly Arg Ala Leu Ala Leu Ser Lys Ala Ser Lys Ser Leu
275 280 285

Leu Ala Gly Ser Pro Arg Tyr His Pro Ile Leu Leu Ser Met Gly Ser
290 295 300

Gln Ile Leu Leu Ala Ala Ile Phe Ser Asp Asp Lys Ser Asn Cys
305 310 315


<210> 5 

<211> 1035 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (73) . . . (867) 

<221> misc_feature

<222> 1032

<223> y=C or T/U

<221> misc_feature
<222> 1033
<223> R=A or G


<221> misc_feature
<222> 1035
<223> W=A or T/U


<400> 5 

ctgtggctgg catgttgtca gctctggctg gaggcaaagg tttggcaatt ttggactgga 60
attgacaaga ag atg ttc cag ctt cta att ccc ctg ctt ttg gca ctc aag 111
Met Phe Gln Leu Leu Ile Pro Leu Leu Leu Ala Leu Lys
1 5 10

gga cat gcc cag gac aat cca gaa aac gta caa tgt ggc cac agg cct 159
Gly His Ala Gln Asp Asn Pro Glu Asn Val Gln Cys Gly His Arg Pro
15 20 25

gct ttt cca aac tcg tca tgg tta cca ttt cat gaa cgg ctt caa gtc 207
Ala Phe Pro Asn Ser Ser Trp Leu Pro Phe His Glu Arg Leu Gln Val
30 35 40 45

cag aat ggt gag tgc ccg tgg caa gtg agt atc cag atg tca cgg aaa 255
Gln Asn Gly Glu Cys Pro Trp Gln Val Ser Ile Gln Met Ser Arg Lys
50 55 60

cac ctc tgt gga ggc tca atc tta cat tgg tgg tgg gtt ctg aca gcc 303
His Leu Cys Gly Gly Ser Ile Leu His Trp Trp Trp Val Leu Thr Ala
65 70 75

gca cac tgc ttc cga aga acc cta tta gac atg gcc gtg gta aat gtc 351
Ala His Cys Phe Arg Arg Thr Leu Leu Asp Met Ala Val Val Asn Val
80 85 90

act gtg gtc atg gga acg aga aca ttc agc aac atc cac tcg gag aga 399
Thr Val Val Met Gly Thr Arg Thr Phe Ser Asn Ile His Ser Glu Arg
95 100 105

aag caa gtg cag aag gtc att att cac aaa gat tac aaa ccg ccc cag 447
Lys Gln Val Gln Lys Val Ile Ile His Lys Asp Tyr Lys Pro Pro Gln
110 115 120 125

ctc gac agt gac ctc tct ctg ctt cta ctt gcc aca cca gtg caa ttc 495
Leu Asp Ser Asp Leu Ser Leu Leu Leu Leu Ala Thr Pro Val Gln Phe
130 135 140

agc aat ttc aaa atg cct gtc tgc ctg cag gag gag gag agg acc tgg 543
Ser Asn Phe Lys Met Pro Val Cys Leu Gln Glu Glu Glu Arg Thr Trp
145 150 155

gac tgg tgt tgg atg gca cag tgg gta acg acc aat ggg tat gac caa 591
Asp Trp Cys Trp Met Ala Gln Trp Val Thr Thr Asn Gly Tyr Asp Gln
160 165 170

tat gat gac tta aac atg cac ctg gaa aag ctg aga gtg gtg cag att 639
Tyr Asp Asp Leu Asn Met His Leu Glu Lys Leu Arg Val Val Gln Ile
175 180 185

agc cgg aaa gaa tgt gcc aag agg gta aac cag ctg tcc agg aac atg 687
Ser Arg Lys Glu Cys Ala Lys Arg Val Asn Gln Leu Ser Arg Asn Met
190 195 200 205

att tgt gct tcg aac gaa cca ggc acc aat ggt atc ttc aag gga gac 735
Ile Cys Ala Ser Asn Glu Pro Gly Thr Asn Gly Ile Phe Lys Gly Asp
210 215 220

agt ggg gca cct ctg gtt tgt gct att tat gga acc cag aga ctc ttc 783
Ser Gly Ala Pro Leu Val Cys Ala Ile Tyr Gly Thr Gln Arg Leu Phe
225 230 235

caa gtg ggt gtc ttc agt ggg ggc ata aga tct ggc tcc agg ggg aga 831
Gln Val Gly Val Phe Ser Gly Gly Ile Arg Ser Gly Ser Arg Gly Arg
240 245 250

cct ggt atg ttt gtg tct gtg gct caa ttt att cca tgaagccagg 877
Pro Gly Met Phe Val Ser Val Ala Gln Phe Ile Pro
255 260 265

aggagacaga aaaggagggg aaagcctaca ccataatctc aggatccacg agaagccgag 937
aagctcactg gtgtgtgttc ctcagtaccc cttcttgcta ggattggggt ctcaaatgct 997
gctggccacc atgtttaccg gtgataaacc taacyrcw 1035

<210> 6
<211> 265 
<212> PRT 

<213> Homo sapiens 
<400> 6 


Met Phe Gln Leu Leu Ile Pro Leu Leu Leu Ala Leu Lys Gly His Ala
1 5 10 15

Gln Asp Asn Pro Glu Asn Val Gln Cys Gly His Arg Pro Ala Phe Pro
20 25 30

Asn Ser Ser Trp Leu Pro Phe His Glu Arg Leu Gln Val Gln Asn Gly
35 40 45

Glu Cys Pro Trp Gln Val Ser Ile Gln Met Ser Arg Lys His Leu Cys
50 55 60

Gly Gly Ser Ile Leu His Trp Trp Trp Val Leu Thr Ala Ala His Cys
65 70 75 80

Phe Arg Arg Thr Leu Leu Asp Met Ala Val Val Asn Val Thr Val Val
85 90 95

Met Gly Thr Arg Thr Phe Ser Asn Ile His Ser Glu Arg Lys Gln Val
100 105 110

Gln Lys Val Ile Ile His Lys Asp Tyr Lys Pro Pro Gln Leu Asp Ser
115 120 125

Asp Leu Ser Leu Leu Leu Leu Ala Thr Pro Val Gln Phe Ser Asn Phe
130 135 140

Lys Met Pro Val Cys Leu Gln Glu Glu Glu Arg Thr Trp Asp Trp Cys
145 150 155 160

Trp Met Ala Gln Trp Val Thr Thr Asn Gly Tyr Asp Gln Tyr Asp Asp
165 170 175

Leu Asn Met His Leu Glu Lys Leu Arg Val Val Gln Ile Ser Arg Lys
180 185 190

Glu Cys Ala Lys Arg Val Asn Gln Leu Ser Arg Asn Met Ile Cys Ala
195 200 205

Ser Asn Glu Pro Gly Thr Asn Gly Ile Phe Lys Gly Asp Ser Gly Ala
210 215 220

Pro Leu Val Cys Ala Ile Tyr Gly Thr Gln Arg Leu Phe Gln Val Gly
225 230 235 240

Val Phe Ser Gly Gly Ile Arg Ser Gly Ser Arg Gly Arg Pro Gly Met
245 250 255

Phe Val Ser Val Ala Gln Phe Ile Pro
260 265









<210> 7 
<211> 1028 
<212> DNA 

<213> Mus musculus 

<220> 
<221> CDS 

<222> (38) . . . (1000) 
<400> 7 

gtcagcctgg cctccaacac acagcacagc cagagcc atg atc ctg ccc tcc atc 55
Met Ile Leu Pro Ser Ile
1 5

ctg cta ctt gtt gcc cac acc ctg gaa gca aat gtt gag tgt ggt gtg 103
Leu Leu Leu Val Ala His Thr Leu Glu Ala Asn Val Glu Cys Gly Val
10 15 20

aga ccc ctg tat gat agc aga att caa tac tcc agg atc ata gaa ggg 151
Arg Pro Leu Tyr Asp Ser Arg Ile Gln Tyr Ser Arg Ile Ile Glu Gly
25 30 35

cag gag gct gag ctg ggt gag ttt cca tgg cag gtg agc att cag gaa 199
Gln Glu Ala Glu Leu Gly Glu Phe Pro Trp Gln Val Ser Ile Gln Glu
40 45 50

agt gac cac cat ttc tgc ggc ggc tcc att ctc agt gag tgg tgg atc 247
Ser Asp His His Phe Cys Gly Gly Ser Ile Leu Ser Glu Trp Trp Ile
55 60 65 70

ctc acc gtg gcc cac tgc ttc tat gct cag gag ctt tcc cca aca gat 295
Leu Thr Val Ala His Cys Phe Tyr Ala Gln Glu Leu Ser Pro Thr Asp
75 80 85

ctc aga gtc aga gtg gga acc aat gac tta act act tca ccc gtg gaa 343
Leu Arg Val Arg Val Gly Thr Asn Asp Leu Thr Thr Ser Pro Val Glu
90 95 100

cta gag gtc acc acc ata atc cgg cac aaa ggc ttt aaa cgg ctg aac 391
Leu Glu Val Thr Thr Ile Ile Arg His Lys Gly Phe Lys Arg Leu Asn
105 110 115

atg gac aac gac att gcc ttg ttg ctg cta gcc aag ccc ttg gcg ttc 439
Met Asp Asn Asp Ile Ala Leu Leu Leu Leu Ala Lys Pro Leu Ala Phe
120 125 130

aat gag ctg acg gtg ccc atc tgc ctt cct ctc tgg ccc gcc cct ccc 487
Asn Glu Leu Thr Val Pro Ile Cys Leu Pro Leu Trp Pro Ala Pro Pro
135 140 145 150

agc tgg cac gaa tgc tgg gtg gca gga tgg ggc gta acc aac tca act 535
Ser Trp His Glu Cys Trp Val Ala Gly Trp Gly Val Thr Asn Ser Thr
155 160 165

gac aag gaa tct atg tca acg gat ctg atg aag gtg ccc atg cgt atc 583
Asp Lys Glu Ser Met Ser Thr Asp Leu Met Lys Val Pro Met Arg Ile
170 175 180

ata gag tgg gag gaa tgc tta cag atg ttt ccc agc ctc acc aca aac 631
Ile Glu Trp Glu Glu Cys Leu Gln Met Phe Pro Ser Leu Thr Thr Asn
185 190 195

atg ctg tgt gcc tca tat ggt aat gag agc tac gat gct tgc cag ggt 679
Met Leu Cys Ala Ser Tyr Gly Asn Glu Ser Tyr Asp Ala Cys Gln Gly
200 205 210

gac agt ggg gga ccg ctt gtc tgc acc aca gat cct ggc agt agg tgg 727
Asp Ser Gly Gly Pro Leu Val Cys Thr Thr Asp Pro Gly Ser Arg Trp
215 220 225 230

tac cag gtg ggc atc atc agc tgg ggc aag agc tgt gga aaa aaa ggc 775
Tyr Gln Val Gly Ile Ile Ser Trp Gly Lys Ser Cys Gly Lys Lys Gly
235 240 245

ttc cca ggg ata tat act gta ttg gca aag tat acc ctg tgg att gag 823
Phe Pro Gly Ile Tyr Thr Val Leu Ala Lys Tyr Thr Leu Trp Ile Glu
250 255 260

aaa ata gcc cag aca gag ggg aag ccc ctg gat ttt aga ggt cag agc 871
Lys Ile Ala Gln Thr Glu Gly Lys Pro Leu Asp Phe Arg Gly Gln Ser
265 270 275

tcc tct aac aag aag aaa aac aga cag aac aat cag ctc tcc aaa tcc 919
Ser Ser Asn Lys Lys Lys Asn Arg Gln Asn Asn Gln Leu Ser Lys Ser
280 285 290

cca gcc ctg aac tgc ccc caa agc tgg ctc ctg ccc tgt ctg ctg tcc 967
Pro Ala Leu Asn Cys Pro Gln Ser Trp Leu Leu Pro Cys Leu Leu Ser
295 300 305 310

ttt gca ctg ctt aga gcc ttg tcc aac tgg aaa taaaacaatg cagtctctga 1020
Phe Ala Leu Leu Arg Ala Leu Ser Asn Trp Lys
315 320

tccaccct 1028

<210> 8 
<211> 321 
<212> PRT 

<213> Mus musculus 
<400> 8 

Met Ile Leu Pro Ser Ile Leu Leu Leu Val Ala His Thr Leu Glu Ala
1 5 10 15

Asn Val Glu Cys Gly Val Arg Pro Leu Tyr Asp Ser Arg Ile Gln Tyr
20 25 30

Ser Arg Ile Ile Glu Gly Gln Glu Ala Glu Leu Gly Glu Phe Pro Trp
35 40 45

Gln Val Ser Ile Gln Glu Ser Asp His His Phe Cys Gly Gly Ser Ile
50 55 60

Leu Ser Glu Trp Trp Ile Leu Thr Val Ala His Cys Phe Tyr Ala Gln
65 70 75 80

Glu Leu Ser Pro Thr Asp Leu Arg Val Arg Val Gly Thr Asn Asp Leu
85 90 95

Thr Thr Ser Pro Val Glu Leu Glu Val Thr Thr Ile Ile Arg His Lys
100 105 110

Gly Phe Lys Arg Leu Asn Met Asp Asn Asp Ile Ala Leu Leu Leu Leu
115 120 125

Ala Lys Pro Leu Ala Phe Asn Glu Leu Thr Val Pro Ile Cys Leu Pro
130 135 140

Leu Trp Pro Ala Pro Pro Ser Trp His Glu Cys Trp Val Ala Gly Trp
145 150 155 160

Gly Val Thr Asn Ser Thr Asp Lys Glu Ser Met Ser Thr Asp Leu Met
165 170 175

Lys Val Pro Met Arg Ile Ile Glu Trp Glu Glu Cys Leu Gln Met Phe
180 185 190

Pro Ser Leu Thr Thr Asn Met Leu Cys Ala Ser Tyr Gly Asn Glu Ser
195 200 205

Tyr Asp Ala Cys Gln Gly Asp Ser Gly Gly Pro Leu Val Cys Thr Thr
210 215 220

Asp Pro Gly Ser Arg Trp Tyr Gln Val Gly Ile Ile Ser Trp Gly Lys
225 230 235 240

Ser Cys Gly Lys Lys Gly Phe Pro Gly Ile Tyr Thr Val Leu Ala Lys
245 250 255

Tyr Thr Leu Trp Ile Glu Lys Ile Ala Gln Thr Glu Gly Lys Pro Leu
260 265 270

Asp Phe Arg Gly Gln Ser Ser Ser Asn Lys Lys Lys Asn Arg Gln Asn
275 280 285

Asn Gln Leu Ser Lys Ser Pro Ala Leu Asn Cys Pro Gln Ser Trp Leu
290 295 300

Leu Pro Cys Leu Leu Ser Phe Ala Leu Leu Arg Ala Leu Ser Asn Trp
305 310 315 320

Lys


<210> 9 
<211> 1123 
<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (41) . . . (1096) 
<400> 9 

ggcctctgtc acccccgggc ccacagcaca gcccagggcc atg etc ctg ttc tea 55 

Met Leu Leu Phe Ser 
1 5 

gtg ttg ctg etc ctg tec ctg gtc acg gga act cag etc ggt cca egg 103 
Val Leu Leu Leu Leu Ser Leu Val Thr Gly Thr Gin Leu Gly Pro Arg 
10 15 20 

act cct etc cca gag get gga gtg get ate eta ggc agg get agg gga 151 
Thr Pro Leu Pro Glu Ala Gly Val Ala lie Leu Gly Arg Ala Arg Gly 
25 30 35 

gee cac cgc cct cag ccc cgt cat ccc ccc age cca gtc agt gaa tgt 199 
Ala- His Arg Pro Gin Pro Arg His Pro Pro Ser Pro Val Ser Glu Cys 
40 45 50 

ggt gac aga tct att ttc gag gga aga act egg tat tec aga ate aca 247 
Gly Asp Arg Ser lie Phe Glu Gly Arg Thr Arg Tyr Ser Arg lie Thr 
55 60 65 

ggg ggg atg gag gcg gag gtg ggt gag ttt ccg tgg cag gtg agt att 2 95 
Gly Gly Met Glu Ala Glu Val Gly Glu Phe Pro Trp Gin Val Ser lie 
70 75 80 85 

cag gca aga agt gaa cct ttc tgt ggc ggc tec ate etc aac aag tgg 343 
Gin Ala Arg Ser Glu Pro Phe Cys Gly Gly Ser lie Leu Asn Lys Trp 
90 95 100 

tgg att etc act gcg get cac tgc tta tat tec gag gag ctg ttt cca 391 
Trp lie Leu Thr Ala Ala His Cys Leu Tyr Ser Glu Glu Leu Phe Pro 
105 110 115 

gaa gaa ctg agt gtc gtg ctg ggg ace aac gac tta act age cca tec 43 9 
Glu Glu Leu Ser Val Val Leu Gly Thr Asn Asp Leu Thr Ser Pro Ser 
120 125 130 


atg gaa ata aag gag gtc gec age ate att ctt cac aaa gac ttt aag 
Met Glu lie Lys Glu Val Ala Ser lie lie Leu His Lys Asp Phe Lys 


487 


-11- 
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135 


140 


145 


aga gcc aac atg gac aat gac att gcc ttg ctg ctg ctg gct tcg ccc 535
Arg Ala Asn Met Asp Asn Asp Ile Ala Leu Leu Leu Leu Ala Ser Pro
150 155 160 165

atc aag ctc gat gac ctg aag gtg ccc atc tgc ctc ccc acg cag ccc 583
Ile Lys Leu Asp Asp Leu Lys Val Pro Ile Cys Leu Pro Thr Gln Pro
170 175 180

ggc cct gcc aca tgg cgc gaa tgc tgg gtg gca ggt tgg ggc cag acc 631
Gly Pro Ala Thr Trp Arg Glu Cys Trp Val Ala Gly Trp Gly Gln Thr
185 190 195

aat gct gct gac aaa aac tct gtg aaa acg gat ctg atg aaa gcg cca 679
Asn Ala Ala Asp Lys Asn Ser Val Lys Thr Asp Leu Met Lys Ala Pro
200 205 210

atg gtc atc atg gac tgg gag gag tgt tca aag atg ttt cca aaa ctt 727
Met Val Ile Met Asp Trp Glu Glu Cys Ser Lys Met Phe Pro Lys Leu
215 220 225

acc aaa aat atg ctg tgt gcc gga tac aag aat gag agc tat gat gcc 775
Thr Lys Asn Met Leu Cys Ala Gly Tyr Lys Asn Glu Ser Tyr Asp Ala
230 235 240 245

tgc aag ggt gac agt ggg ggg cct ctg gtc tgc acc cca gag cct ggt 823
Cys Lys Gly Asp Ser Gly Gly Pro Leu Val Cys Thr Pro Glu Pro Gly
250 255 260

gag aag tgg tac cag gtg ggc atc atc agc tgg gga aag agc tgt gga 871
Glu Lys Trp Tyr Gln Val Gly Ile Ile Ser Trp Gly Lys Ser Cys Gly
265 270 275

gat aag aac acc cca ggg ata tac acc tcg ttg gtg aac tac aac ctc 919
Asp Lys Asn Thr Pro Gly Ile Tyr Thr Ser Leu Val Asn Tyr Asn Leu
280 285 290

tgg atc gag aaa gtg acc cag cta gga ggc agg ccc ttc aat gca gag 967
Trp Ile Glu Lys Val Thr Gln Leu Gly Gly Arg Pro Phe Asn Ala Glu
295 300 305

aaa agg agg act tct gtc aaa cag aaa cct atg ggc tcc cca gtc tcg 1015
Lys Arg Arg Thr Ser Val Lys Gln Lys Pro Met Gly Ser Pro Val Ser
310 315 320 325

gga gtc cca gag cca ggc agc ccc aga tcc tgg ctc ctg ctc tgt ccc 1063
Gly Val Pro Glu Pro Gly Ser Pro Arg Ser Trp Leu Leu Leu Cys Pro
330 335 340

ctg tcc cat gtg ttg ttc aga gct att ttg tac tgataataaa atagaggcta 1116
Leu Ser His Val Leu Phe Arg Ala Ile Leu Tyr
345 350 

ttctttc 1123 

<210> 10 
<211> 352 
<212> PRT 
<213> Homo sapiens


<400> 10 


Met Leu Leu Phe Ser Val Leu Leu Leu Leu Ser Leu Val Thr Gly Thr
1 5 10 15

Gln Leu Gly Pro Arg Thr Pro Leu Pro Glu Ala Gly Val Ala Ile Leu
20 25 30

Gly Arg Ala Arg Gly Ala His Arg Pro Gln Pro Arg His Pro Pro Ser
35 40 45

??? ??? ??? Ser Glu Cys Gly Asp Arg Ser Ile Phe Glu Gly Arg Thr Arg
50 55 60

??? ??? Arg Ile Thr Gly Gly Met Glu Ala Glu Val Gly Glu Phe Pro
65 70 75 80

??? Gln Val Ser Ile Gln Ala Arg Ser Glu Pro Phe Cys Gly Gly Ser
85 90 95

Ile Leu Asn Lys Trp Trp Ile Leu Thr Ala Ala His Cys Leu Tyr Ser
100 105 110

Glu Glu Leu Phe Pro Glu Glu Leu Ser Val Val Leu Gly Thr Asn Asp
115 120 125

Leu Thr Ser Pro Ser Met Glu Ile Lys Glu Val Ala Ser Ile Ile Leu
130 135 140

His Lys Asp Phe Lys Arg Ala Asn Met Asp Asn Asp Ile Ala Leu Leu
145 150 155 160

Leu Leu Ala Ser Pro Ile Lys Leu Asp Asp Leu Lys Val Pro Ile Cys
165 170 175

Leu Pro Thr Gln Pro Gly Pro Ala Thr Trp Arg Glu Cys Trp Val Ala
180 185 190

Gly Trp Gly Gln Thr Asn Ala Ala Asp Lys Asn Ser Val Lys Thr Asp
195 200 205

Leu Met Lys Ala Pro Met Val Ile Met Asp Trp Glu Glu Cys Ser Lys
210 215 220

Met Phe Pro Lys Leu Thr Lys Asn Met Leu Cys Ala Gly Tyr Lys Asn
225 230 235 240

Glu Ser Tyr Asp Ala Cys Lys Gly Asp Ser Gly Gly Pro Leu Val Cys
245 250 255

Thr Pro Glu Pro Gly Glu Lys Trp Tyr Gln Val Gly Ile Ile Ser Trp
260 265 270

Gly Lys Ser Cys Gly Asp Lys Asn Thr Pro Gly Ile Tyr Thr Ser Leu
275 280 285

Val Asn Tyr Asn Leu Trp Ile Glu Lys Val Thr Gln Leu Gly Gly Arg
290 295 300

Pro Phe Asn Ala Glu Lys Arg Arg Thr Ser Val Lys Gln Lys Pro Met
305 310 315 320

Gly Ser Pro Val Ser Gly Val Pro Glu Pro Gly Ser Pro Arg Ser Trp
325 330 335

Leu Leu Leu Cys Pro Leu Ser His Val Leu Phe Arg Ala Ile Leu Tyr
340 345 350

<210> 11 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "76A5sc2-B", an artificially synthesized primer 
sequence 


<400> 11 


gatcmacagg tgccagtcat ca 


<210> 12 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "SPORT SP6", an artificially synthesized primer
sequence 

<400> 12 

atttaggtga cactatagaa 

<210> 13 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "SPORT Fw", an artificially synthesized primer
sequence 

<400> 13 

tgtaaaacga cggccagt 

<210> 14 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "Sport RV", an artificially synthesized primer
sequence 

<400> 14 

caggaaacag ctatgacc 

<210> 15 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-C", an artificially synthesized primer 
sequence 

<400> 15 

atgcttctgc tatcgtggaa gg 

<210> 16 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "SPORT T7", an artificially synthesized primer 
sequence 


<400> 16 

taatacgact cactataggg 20


<210> 17 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-B", an artificially synthesized primer 
sequence 

<400> 17 

ctttgtgctg aggtcttcag tg 22 

<210> 18
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> "No9-G", an artificially synthesized primer
sequence

<400> 18

cagtcaatgt cactgtggtc at 22

<210> 19
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> "No9-J", an artificially synthesized primer
sequence

<400> 19

acttgccgtt ggtgcccact tc 22

<210> 20
<211> 23
<212> DNA

<213> Artificial Sequence
<220>

<223> "No9-P", an artificially synthesized primer
sequence

<400> 20

gcactggaat gacaacatga tgc 23

<210> 21
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> "No9-Q", an artificially synthesized primer
sequence

<400> 21

attggcgtgg caagtaggag ca

<210> 22 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-N", an artificially synthesized primer 
sequence 

<400> 22 

cgagtctccc agttagcaca ga 

<210> 23 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-M", an artificially synthesized primer 
sequence 

<400> 23 

cggtgacttg gtcatgtctg tg 

<210> 24 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-K", an artificially synthesized primer 
sequence

<400> 24 

ggatccatga aacgatggaa ggacagaag 

<210> 25 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-O", an artificially synthesized primer
sequence 

<400> 25 

cgcagagttc tgctcataca ta 

<210> 26 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "No9-A", an artificially synthesized primer 
sequence 


<400> 26 

ggcatgtagc tcactggcat g 


<210> 27 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "29 (-)", an artificially synthesized primer 
sequence 

<400> 27 

ggaccagcaa gaatcagttc tg 

<210> 28 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "17 (+) 95 (+)", an artificially synthesized 
primer sequence 

<400> 28 

ctgctaccag ttctaatttg cc 

<210> 29 
<211> 22 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> "G3PDH 5' ", an artificially synthesized primer 
sequence

<400> 29 

gagattgttg ccatcaacga cc

<210> 30 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "G3PDH 3' ", an artificially synthesized primer 
sequence 

<400> 30 

gttgaagtcg caggagacaa cc 

<210> 31 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "h-B", an artificially synthesized primer 
sequence 



<400> 31 

agaggtcact gtcgagctgg g 21


<210> 32 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "h-D", an artificially synthesized primer 
sequence 

<400> 32 

tgtgaataat gaccttctgc ac 22 

<210> 33
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> "h-A", an artificially synthesized primer
sequence

<400> 33

ttcagcaaca tccactcgga ga 22

<210> 34
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> "h-C", an artificially synthesized primer
sequence

<400> 34

aagcaagtgc agaaggtcat ta 22

<210> 35
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> "h-F", an artificially synthesized primer
sequence

<400> 35

cattggtcgt tacccactgt gc 22

<210> 36
<211> 23
<212> DNA

<213> Artificial Sequence
<220>

<223> "PR01-E", an artificially synthesized primer
sequence

<400> 36

attctcaatg agtggtgggt tct

<210> 37 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "PR01-D", an artificially synthesized primer 
sequence 

<400> 37 

ccagcacaca gcatattctt gg 

<210> 38 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "hPR03-B", an artificially synthesized primer 
sequence 

<400> 38 

ggaaacagct cctcggaata taagc 

<210> 39 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "hPR03-D", an artificially synthesized primer 
sequence

<400> 39 

tggatgggct agttaagtcg ttggt 

<210> 40 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "hPR03-A", an artificially synthesized primer
sequence 

<400> 40 

ttcgagggaa gaactcggta ttc 

<210> 41 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "hPR03-C", an artificially synthesized primer 
sequence 


<400> 41 

tgtgaaaacg gatctgatga aagcg 

<210> 42 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "mPR03-B", an artificially synthesized primer 
sequence 

<400> 42 

cacctactgc caggatctgt gg 

<210> 43 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "mPR03-D", an artificially synthesized primer 
sequence 

<400> 43 

ggctattttc tcaatccaca gggta 

<210> 44 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "mPR03-A", an artificially synthesized primer 
sequence

<400> 44 

atagagtggg aggaatgctt acaga 

<210> 45 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> "mPR03-C", an artificially synthesized primer 
sequence 

<400> 45 

gctacgatgc ttgccagggt g 

<210> 46 
<211> 12 
<212> PRT 

<213> Mus musculus 


<400> 46 

Gly Lys Cys Gln Gly Asp Ser Gly Ala Pro Met Val
1 5 10 


<210> 47 
<211> 12 
<212> PRT 

<213> Artificial Sequence 


<220> 

<223> derived from Homo sapiens and Mus musculus

<221> VARIANT
<222> 1

<223> Xaa=Asp, Asn, Ser, Thr, Ala, Gly, or Cys.

<221> VARIANT
<222> 2

<223> Xaa=Gly, Ser, Thr, Ala, Pro, Ile, Met, Val, Gln,
or His.


<221> VARIANT
<222> 3

<223> Xaa=any amino acid

<221> VARIANT
<222> 4

<223> Xaa=any amino acid

<221> VARIANT
<222> 6

<223> Xaa=Asp or Glu

<221> VARIANT
<222> (9)...(9)
<223> Xaa=Gly or Ser.


<221> VARIANT
<222> (10)...(10)

<223> Xaa=Ser, Ala, Pro, His, or Val.


<221> VARIANT
<222> (11)...(11)

<223> Xaa=Leu, Ile, Val, Met, Phe, Tyr, Trp, or His.


<221> VARIANT
<222> (12)...(12)

<223> Xaa=Leu, Ile, Val, Met, Phe, Tyr, Ser, Thr, Ala,
Asn, Gln, or His.


<400> 47

Xaa Xaa Xaa Xaa Gly Xaa Ser Gly Xaa Xaa Xaa Xaa
1 5 10


<210> 48 
<211> 12 
<212> PRT 

<213> Homo sapiens 


<400> 48 

Gly Ile Phe Lys Gly Asp Ser Gly Ala Pro Leu Val
1 5 10


<210> 49 
<211> 6 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> derived from Homo sapiens and Mus musculus 

<221> VARIANT 
<222> 1 

<223> Xaa=Leu, Ile, Val, or Met.

<221> VARIANT 
<222> 2 

<223> Xaa=Ser or Thr.

<221> VARIANT 
<222> 4 

<223> Xaa=Ser, Thr, Ala, or Gly. 
<400> 49 

Xaa Xaa Ala Xaa His Cys 
1 5 


<210> 50 
<211> 6 
<212> PRT 

<213> Mus musculus 
<400> 50 

Leu Thr Val Ala His Cys
1 5 


<210> 51 
<211> 343 
<212> PRT 

<213> Homo sapiens 


<400> 51 


Met Ala Gln Lys Gly Val Leu Gly Pro Gly Gln Leu Gly Ala Val Ala
1 5 10 15

Ile Leu Leu Tyr Leu Gly Leu Leu Arg Ser Gly Thr Gly Ala Glu Gly
20 25 30

Ala Glu Ala Pro Cys Gly Val Ala Pro Gln Ala Arg Ile Thr Gly Gly
35 40 45

Ser Ser Ala Val Ala Gly Gln Trp Pro Trp Gln Val Ser Ile Thr Tyr
50 55 60

Glu Gly Val His Val Cys Gly Gly Ser Leu Val Ser Glu Gln Trp Val
65 70 75 80

Leu Ser Ala Ala His Cys Phe Pro Ser Glu His His Lys Glu Ala Tyr
85 90 95

Glu Val Lys Leu Gly Ala His Gln Leu Asp Ser Tyr Ser Glu Asp Ala
100 105 110



Lys Val Ser Thr Leu Lys Asp Ile Ile Pro His Pro Ser Tyr Leu Gln
115 120 125

Glu Gly Ser Gln Gly Asp Ile Ala Leu Leu Gln Leu Ser Arg Pro Ile
130 135 140

Thr Phe Ser Arg Tyr Ile Arg Pro Ile Cys Leu Pro Ala Ala Asn Ala
145 150 155 160

Ser Phe Pro Asn Gly Leu His Cys Thr Val Thr Gly Trp Gly His Val
165 170 175

Ala Pro Ser Val Ser Leu Leu Thr Pro Lys Pro Leu Gln Gln Leu Glu
180 185 190

Val Pro Leu Ile Ser Arg Glu Thr Cys Asn Cys Leu Tyr Asn Ile Asp
195 200 205

Ala Lys Pro Glu Glu Pro His Phe Val Gln Glu Asp Met Val Cys Ala
210 215 220

Gly Tyr Val Glu Gly Gly Lys Asp Ala Cys Gln Gly Asp Ser Gly Gly
225 230 235 240

Pro Leu Ser Cys Pro Val Glu Gly Leu Trp Tyr Leu Thr Gly Ile Val
245 250 255

Ser Trp Gly Asp Ala Cys Gly Ala Arg Asn Arg Pro Gly Val Tyr Thr
260 265 270

Leu Ala Ser Ser Tyr Ala Ser Trp Ile Gln Ser Lys Val Thr Glu Leu
275 280 285

Gln Pro Arg Val Val Pro Gln Thr Gln Glu Ser Gln Pro Asp Ser Asn
290 295 300

Leu Cys Gly Ser His Leu Ala Phe Ser Ser Ala Pro Ala Gln Gly Leu
305 310 315 320

Leu Arg Pro Ile Leu Phe Leu Pro Leu Gly Leu Ala Leu Gly Leu Leu
325 330 335

Ser Pro Trp Leu Ser Glu His
340

<210> 52
<211> 436
<212> PRT

<213> Mus musculus


<400> 52














Met 

Val 

Glu 

Met 

Leu 

irX \J 

1 i IX 

Va 1 

V CL 1 

Al ^ 
1 d 

vdi 

LGU 

veil 

T .01 1 

Ala 

Veil 

Cpr 

1 




5 





1 0 

1 \j 





15 


Val 

Val 

Ala 

Lys 

Asp 

Asn 

X 11X 

1 ill 

V— y 0 

A cn 
rtb{J 

P 1 V 

uiy 

rl KJ 

v^y 

uiy 

Leu 

A ttt 

Ai y 




20 





25 





3 0 



Phe 

Arg 

Gin 

Asn 

Ser 

pi n 

ulll 

nla 

P 1 \/- 

1 111 

A y~rr 
Al y 

Tl p 

11c 

V cl 1 

OCX 

rl v 

uiy 

m n 

Cpr 

OCX 



35 





40 





45 




Ala 

Gin 

Leu 

Gly 

Ala 

irp 

rlO 

i rp 

X v lt3 U 

vdl 

Qnr 

Ocl 

Leu 

Pin 

uXi.1 

Tip 

lit; 

it i ifcr 

1 in 


50 





R R 





0 





Ser 

His 

Asn 

Ser 

Arg 

Arg 

Tyr 

H 1 S 

Ala 
Aid. 

v^y ib 

P 1 \r 

uiy 

Pi \r 

uiy 

Ocl 

Leu 

Leu 

Asn 

65 





7 n 

' V 










ft 0 

Ser 

His 

Trp 

Val 

Leu 

1 IlX 

A 1 -n 

Ala 

rll s 

tyb 

i Iltr 

Asp 

Asn 

Lys 

Lys 

Lys 





85 





Q 0 





Q R 


Val 

Tyr 

Asp 

Trp 

Arg 

T 

Leu. 


Fne 

biy 

A 1 ^3 

Ala 

bin 

blU 

lie 

ulU 

Tyr 

nl tt 

biy 




100 





105 





no 



Arg Asn Lys Pro Val Lys Glu Pro Gln Gln Glu Arg Tyr Val Gln Lys
115 120 125

Ile Val Ile His Glu Lys Tyr Asn Val Val Thr Glu Gly Asn Asp Ile
130 135 140

Ala Leu Leu Lys Ile Thr Pro Pro Val Thr Cys Gly Asn Phe Ile Gly
145 150 155 160

Pro Cys Cys Leu Pro His Phe Lys Ala Gly Pro Pro Gln Ile Pro His
165 170 175


TV) r~ 
± nx 

\— y o 

lyx 

V CL X 

TV) y~ 
1 ilx 

P 1 \7 

b ly 

irp 


iyr 

Tic, 

Lys 

C 1 n 
bill 

Lys 

Ala 

Pro 

Arg 




180 





IPS 

J- O J 





1 _/ u 



IT X (J 


IT .L U 

V ct X 

Le u 

Flfc: L, 

pin 

bl U 

Hid 

Arg 


Asp 

Leu 

lie 

Asp 

Leu 

Asp 



195 





^ VJ 





9 n r 

ZU J 





y o 

Asn 

Del 

TVi -r- 
X XIX 

Pin 
bill 

i rp 

Tyr 

Asn 

biy 

Arg 

vai 

Thr 

Ser 

Thr 

Asn 


210 





9 1 R 
Z 1 J 





ZzU 





Veil 

v^y 



Tt 

iyr 

Jrx (J 

PI 11 
bl U 

Pi w 

biy 

Lys 

Tie 

lie 

Asp 

Thr 

Cys 

bin 

uiy 

Asp 

9 9 R 





9 ^ n 

ZjU 





ZJJ 





z4U 

c pr 

O tri- 

pi \/ 


Jr X tj 

Leu 

Mnt- 

i v it; u 

Lyb 

Arg 

Asp 

Asn 

Val 

Asp 


Pro 

rfie 

val 





J 





ZjU 





Add 


V Cl-L 

V Cl-L 


Tl ^ 
X X tr 

TV) r~ 
X 1 IX 

QfciX. 

irp 

Pl \7 

biy 

veil 

Pi 

biy 

Lys 

a 1 =a 
Ala 

Arg 

Aid 

Lys 

Arg 




9 n 
o u 





9 S 
Z O D 





9 "7 n 
z / u 



Jrx O 

b ly 

veil 

lyr 

TVi t- 
1 Ilx 

A 1 -a 

Alcl 

TVi t* 

1 Ilx 

irp 

Asp 

iyr 

Leu 

Asp 

Trp 

T 1 Q 

lie 

Ala 




9 7 R 





oon 
ZoU 





not: 
Z o D 




T 

Lys 

Tl ^ 

X X tr 

P 1 \7 

biy 

Jr X O 

ash 

Ala. 

-Leu. 

TJ -i o 

nlS 

Leu 

lie 

bin 

Pro 

Ala 

Thr 

Pro 

HIS 


9 Q n 

Z J u 





Z zJ D 





inn 





Jrx O 

Jr X O 

J. I1X 

1 Ilx 

Arg 

JrlX S 

Pro 

i v iec 

val 

C r~i v- 

ber 

vne 

Hi S 

Pro 

Pro 

Ser 

Leu 

3 05 





~\ 1 n 





•31 C 
Jlj 





ion 
o / U 

/\x 9 

rl U 

r I LJ 

i rp 

iyr 

irne 

Pin 

bin 

U-J r. 

Leu 

Pro 

O ^ -v- 

ber 

Arg 

Pro 

Leu 

Tyr 

Leu 





9 R 





JjU 





JJ J 


Arg 

Jrx v_> 

Leu 

Arg 

Jrx O 

Lgu 

Leu 

nlS 

Arg 

Pro 

Ser 

Ser 

Thr 

pin 
bin 

i nr 

ber 




^ a n 





J 44 D 





"3RD 




Ccv 

Leu 

jyieu 

Pro 

Leu 

Leu 

Ser 

Pro 

Pro 

Thr 

Pro 

A 1 ~. 

Ala 

bin 

Pro 

Ala 



a a 
ODD 





3 60 





3 65 




Ser Phe Thr Ile Ala Thr Gln His Met Arg His Arg Thr Thr Leu Ser
370 375 380

Phe Ala Arg Arg Leu Gln Arg Leu Ile Glu Ala Leu Lys Met Arg Thr
385 390 395 400

Tyr Pro Met Lys His Pro Ser Gln Tyr Ser Gly Pro Arg Asn Tyr His
405 410 415

Tyr Arg Phe Ser Thr Phe Glu Pro Leu Ser Asn Lys Pro Ser Glu Pro
420 425 430

Phe Leu His Ser

<210> 53 
<211> 246 
<212> PRT 

<213> Mus musculus 


<400> 53 


Met Ser Ala Leu Leu Ile Leu Ala Leu Val Gly Ala Ala Val Ala Phe
1 5 10 15

Pro Val Asp Asp Asp Asp Lys Ile Val Gly Gly Tyr Thr Cys Arg Glu
20 25 30

Ser Ser Val Pro Tyr Gln Val Ser Leu Asn Ala Gly Tyr His Phe Cys
35 40 45

Gly Gly Ser Leu Ile Asn Asp Gln Trp Val Val Ser Ala Ala His Cys
50 55 60

Tyr Lys Tyr Arg Ile Gln Val Arg Leu Gly Glu His Asn Ile Asn Val
65 70 75 80

Leu Glu Gly Asn Glu Gln Phe Val Asp Ser Ala Lys Ile Ile Arg His
85 90 95


Pro 

Asn 

Tyr 

Asn 

Ser 

Trp 

Thr 

Leu 

Asp 

Asn 

Asp 

He 

Met 
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